INVARIANTS AND STRUCTURED SINGULAR VALUES


Introduction 


The  theme  of  the  ONR-sponsored  research  program  at  Honeywell  for  the  last  eight  years  was 
the  mathematical  theory  of  robust  control.  The  theoretical  novelty  that  fueled  the  technical 
effort  was  the  structured  singular  value  (SSV).  The  result  was  a  new  theory  of  robust  control 
based  on  the  structured  singular  value.  Control  analysis  and  design  tools  based  on  this  theory 
have  gained  widespread  industry  acceptance. 

Quoting  Gary  Balas  of  MUSYN: 

To  gauge  the  control  community’s  interest  in  these  topics,  a  short  course  offered 
in  September  1989  by  MUSYN  Inc.  titled,  "Theory  and  Applications  of  Robust 
Multivariable  Control,"  was  attended  by  over  45  people  from  industry,  academia 
and  government  laboratories.  The  course  was  offered  again  in  March,  1990  at 
NASA  Langley  Research  Center  to  25  NASA  and  Air  Force  engineers.  These 
researchers  are  interested  in  applying  these  techniques  to  topics  such  as:  flight 
control,  flutter  suppression  and  vibration  attenuation  of  flexible  structures.  In  the 
summer of 1990 the course will be offered in Cambridge, England; Delft, Netherlands;
and Pasadena, California. It is believed that over 200 people will attend
the  robust  multivariable  short  course  in  1990. 

Currently, control design techniques and $\mu$-analysis and synthesis methods are
being used to design flight control systems, vibration attenuation control laws for
flexible structures, and missile autopilots. Johns Hopkins University Applied Physics
Laboratory (JHUAPL), China Lake Naval Weapons Center and Dahlgren
Naval Weapons Center are applying such methods to design of robust control
laws for missile autopilots. JHUAPL, in addition, is investigating the application
of the $\mu$-synthesis methodology to the design of gain-scheduled autopilots and
guidance and navigation algorithms.

In our experience at Honeywell’s Systems and Research Center over the last five years, we
have seen these methods applied to numerous aerospace control system analysis and design
problems involving the F-15 STOL DEMO vehicle, the Space Shuttle, and NASP, to name just a
few. In current control design applications the SSV is the key ingredient that ensures the needed
multivariable robustness properties of the feedback control laws.

The Structured Singular Value concept was in its infancy when this ONR-sponsored program
began more than eight years ago. The rapid growth of the concept during those early years led
to the ONR/Honeywell Workshop, a three-day event in October of 1984. Featured speakers at
that Workshop were (among others) Gunter Stein, John Doyle, and Bruce Francis. The notes from
the presentations of these three speakers constituted the official set of Workshop Notes. Those
Notes constitute Volume 1 of this final report.

It is satisfying to see a theoretical concept grow and find practical applications as successfully
as the SSV concept has over the last eight years. This ONR-sponsored program has provided
valuable support for the basic mathematical research effort. The successful collaboration we
have had between top-notch mathematics and control experts in academia and industry is not
easy to keep alive without some source of government research funds.

When it came time to write the final report, several possible topics came to mind:
documenting the history of the SSV development, a serious study of robust, practical
control law design, a (finally) comprehensible development of the SSV theory with a
survey and comparison of accumulated results, etc.

After  several  unsuccessful  attempts  to  write  on  several  of  those  sensible  and  useful  topics,  it 
became  clear  that  the  final  report  was  going  to  be  —  more  mathematical  theory  (this  is  a 
research  contract,  after  all). 


Technical  Report  Summary 

The underlying mathematical theory was and still remains an evolving bundle of concepts
and techniques, often discussed and written about by the investigators, never completely
formalized. This report is primarily devoted to a presentation of a relatively small piece of the
overall SSV theory and the simplest part at that: the diagonal, unrepeated complex block
problem. One would have thought that after eight years of research the simplest part of the
theory would have been worked out to everyone’s satisfaction. In fact, for the most simple,
nontrivial $\mu$-problem (2x2 diagonal unrepeated blocks) it has.

After the notation in section 1 of the report, we present to everyone’s satisfaction (or, at least
the author’s satisfaction) the 2x2 diagonal-block $\mu$-theory. If all we wanted was a statement
and  proof  of  the  result  we  could  have  made  that  section  much  shorter.  The  greater  effort  put 
into  that  simple  problem  was  part  of  a  (not  very  well  concealed)  plan  to  bring  invariant  theory 
into  the  picture. 

To  see  the  invariants,  and  the  role  they  play  in  the  problem,  the  theory  must  be  polished  to  a 
very  fine  resolution.  Every  parameter  of  the  problem  must  be  accounted  for.  After  going 
through  the  theory  in  this  new  way  we  started  to  see  things  a  little  differently.  That  was  the 
plan, because our secret goal was to solve, at last, the four-block diagonal $\mu$-problem by a
method that could be implemented efficiently on a computer. That simple problem was the
first $\mu$-problem on the hit-list after Doyle proved his remarkable theorem for the three-block
case  in  1982.  The  author  tried  his  hand  at  it  and  failed.  It  remained  unsolved  throughout  the 
duration  of  this  contract.  Clearly,  we  were  missing  something,  so  a  fresh  approach  looked  like 
a  good  idea. 

We solved the four-block diagonal $\mu$-problem. We did it while trying to use invariant theory.
We developed a reasonable algorithm, implemented and tested it on a computer, and it
seemed to work (with no other way to find the exact answer, how do we know if it works? -
it did agree with Packard’s lower bound to within a couple of percent). Plots of the results are
shown  at  the  end  of  Section  3.  Then  we  started  to  write  the  details  of  the  proofs.  The  theory 
started  to  change  (but,  remarkably,  the  algorithm  did  not).  The  theory  below  (still  in  flux)  is 
the  current  version. 

Many  of  the  results  presented  here  were  known  earlier,  but  there  are  several  ideas  that  seem 
original.  A  partial  list  of  highlights  is: 

1)  Theorem  3.1  (with  Last  Minute  Remarks  at  the  end  of  section  4): 

This theorem is a canonical form description of the diagonal, non-repeated block
$\mu$-problem. Though not made truly canonical until the report was almost finished,
in its incomplete form it laid a foundation for the geometric analysis. In its complete
form it allows immediate classification of the 3-block $\mu$-problem within a
space depending (generically) on 8 real parameters (the four-block $\mu$-problem
appears to depend on 23 real parameters).

2) Theorem 3.3

This theorem led to the development of the computational algorithm that can now
be used to compute $\mu(M)$ exactly for the 4-diagonal block problem. It provides a
set of polynomial equations, equations that are very easy to understand and work
with (if not always easy to solve), and shows how the solutions to that set of
equations give solutions to the $\mu$-problem.


3)  Constructive  algorithm  3.1 

There  is  a  general  result  from  classical  elimination  theory  stating  that  a  system  of 
n+1  homogeneous  polynomial  equations  in  n  unknowns  can  be  solved  --  that 
method can (in principle) be used to solve the $\mu$-problem given the result of
Theorem 3.3. The constructive algorithm we use, though not formalized yet, is a
form  of  elimination  specially  tailored  for  the  polynomial  system  at  hand.  There 
might  be  better  methods  available,  but  this  one  was  readily  understood,  easy  to 
code,  numerically  robust,  etc. 

4)  Loose  End  4.1 

This rambling discussion describes the very recent efforts toward applying invariant
theory to improve our understanding of structured singular values, with an eye
toward efficient computational methods. In the absence of any new or practically
useful results to show for our efforts so far, we have focused attention on explaining
why we approached the problem from this viewpoint and what we can realistically
hope to achieve. The final observation made at the deadline, Observation
4.4, explains how we intend to proceed.


The  treatment  here  is  far  from  a  comprehensive  study  of  structured  singular  values.  For  those 
who  want  a  broader  reference,  the  best  we  know  is  the  1988  AFOSR  Report  "Robust  Control 
of  Multivariable  and  Large  Scale  Systems,"  by  Andy  Packard  and  John  Doyle. 

The  parameter  spaces  used  here  are  the  same  as  those  used  in  past  studies.  The  new  idea  in 
this  report  is  to  look  at  the  problem  from  a  global  perspective,  to  find  all  the  points  that  look 
like they could be $\mu(M)$, and determine how to pick the right one. Previous studies quickly
eliminated all of the bogus $\mu$-like points, but they were forced to work locally to do so (to the
credit  of  their  inventors,  the  local  techniques  have  led  to  some  good  global  results). 

The characterization of the set $S_{sing}$ in Theorem 3.2 and the correspondence between those
points and the real points on a singular complex variety given in Theorem 3.3 turn out to be
natural (at least the author still thinks so). The problem with this global perspective is the
large number of complex points on that variety, which eat up computer time because one has
to carry them along. For the three and four block problems the dimensions work out
so that the global evaluation is feasible, but already for dimension 5 we are not sure if this
approach is still practical.

The  final  section  on  loose  ends  discusses  and  sometimes  improves  upon  the  shortcomings  of 
results  in  earlier  sections.  A  major  part  of  that  section  is  devoted  to  an  informal  discussion  of 
what  we  are  trying  to  do  with  invariants.  There  is  no  telling  when  we  will  succeed  in  solving 
the  problems  stated  in  that  section,  but  the  success  we  have  had  implementing  code  to  solve 
these  problems  confirms  that  these  problems  are  solvable. 

In short, we solved the problem we set out to solve but probably not much more. In the process
we have come to a better understanding of the complexity of the $\mu$-problem, and we have
some less than certain approaches to extending the computable results out to 6 or 8 blocks or
so. The picture revealed in going through the theory is fascinating, with inevitable ties to
algebraic geometry and classical invariant theory. We hope the reader will find something of
interest in the presentation.

Programmatic  Overview 

This  program  was  people  working  on  theory,  talking  to  each  other,  and  writing  papers  for 
publication. 

This report was written by the Honeywell Principal Investigator, but the ideas presented (at
least the good ones) are primarily the results of consultants who worked on the program and
of colleagues at Honeywell Systems and Research Center. Names of those whose efforts
contributed directly to this program are:

Consultants,  Visitors, ... 


1)  John  Doyle  -  Star  of  the  team,  SSV  Inventor 

2)  Andy  Packard 

3)  Allen  Tannenbaum 

4)  Pramod  Khargonekar 

5)  Mike  Safonov 

6)  Jim  Freudenberg 

7)  Bruce  Francis 

Honeywell Personnel and Expatriates

1) Gunter Stein

2)  Gary  Hartmann 

3)  Mike  Barrett 

4)  Dale  Enns 

5)  Jim  Krause 

6)  Kathryn  Lenz 

7)  Chester  Chu 

8)  Mike  Elgersma 

9)  Dan  Bugajski 

10)  Joe  Wall 

11)  An  Harvey 

12)  Blaise  Morton  -  Honeywell  PI 

Apologies to those who played a role and are not listed here. It is hard to remember all the
contributors to an eight-year-old program. It is also difficult to keep track of who gets credit
for everything. You guys know what you did.

There are many other publications that were either completely or partially funded by this
contract. We are willing to provide a more complete list if desired.


For another novel approach to the $\mu$ problem (not partially funded by this contract), there is a
very  fine  piece  of  recent  work  co-authored  by  one  of  our  co-investigators: 

Bercovici,  H.,  C.  Foias,  and  A.  Tannenbaum,  "Structured  Interpolation  Theory,"  Preprint. 

One of the interesting results in this paper is a new proof of Doyle’s three-block theorem -
that the upper-bound is equal to $\mu$ in the case of a three-block $\Delta$ structure.


1. Notation


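Let M be a complex $n \times n$ matrix and denote by $\Delta$ the set of block diagonal matrices:

\[
\Delta = \left\{
\begin{bmatrix}
\Delta_1 & 0 & \cdots & 0 \\
0 & \Delta_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \Delta_k
\end{bmatrix}
\right\}
\tag{1}
\]

where each $\Delta_j$ is a complex $m_j \times m_j$ matrix, $m_1 + \cdots + m_k = n$. In this structure one or
more blocks on the diagonal may be repeated; an important special case is the set of scalar
matrices, denoted $\delta\,\mathrm{Id}_n$, that commutes with all $n \times n$ matrices.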

In this note the scalar matrices are the only structure considered with repeated blocks. This
restriction to the non-repeated case allows a simplification of notation: when we want to
indicate a specific structure we affix the values $m_1, \dots, m_k$ as superscripts to $\Delta$ (e.g. $\Delta^{1,1,1,1}$ is
the set of diagonal 4x4 matrices). When the specific structure is not important we simply
use the symbol $\Delta$ and place the burden on the reader to remember that a set of positive
integers $m_j$ summing to n is tacitly assumed.

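Associated with $\Delta$ is the set $U_\Delta$ of block-diagonal unitary (i.e. $U^*U = \mathrm{Id}$) matrices that are
contained in $\Delta$. For example, when $\Delta$ is the set of diagonal matrices we have:

\[
U_\Delta = \left\{ \operatorname{diag}\left( e^{i\theta_1}, \dots, e^{i\theta_n} \right) \right\}
\tag{2}
\]

where each $\theta_j$ is a real number.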

We are interested in solving the following maximization problem: find

\[
\max \{\, \rho(XM) \mid X \in U_\Delta \,\}
\tag{3}
\]

where $\rho$ is the spectral radius function. The solution to this maximization problem was
shown by Doyle to be the structured singular value of M, denoted $\mu(M)$, for the structure $\Delta$.

There are two special structures for which $\mu$ can be identified with standard, important
functions. The spectral radius of M is the function $\mu(M)$ associated with the structure $\delta\,\mathrm{Id}_n$, and
the maximum singular value of M is the function $\mu(M)$ associated with the structure $\Delta^n$. These
two special cases of the function $\mu$ are extreme in that $\mu(M)$ for any other structure lies
between those two values.

Another special case arises when $\Delta$ is the set of diagonal $n \times n$ matrices: then k = n and each
$m_j$ is 1. It is this special case we will discuss below.
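To make the maximization in equation 3 and the two extreme cases concrete, here is a minimal Python sketch (standard library only) for the diagonal structure with n = 2. It estimates the maximum of the spectral radius of XM by brute force over a phase grid and compares it with the spectral radius and the maximum singular value of M. The helper names (`spectral_radius`, `max_singular_value`, `mu_grid`) and the grid size are assumptions made for illustration, and the test matrix is the one used in the numerical example of Section 2. Because the spectral radius of XM is unchanged when X is multiplied by a unit scalar, only the phase difference needs to be scanned. A grid search gives a lower estimate only; it is not the method developed in this report.

```python
import cmath
import math

def spectral_radius(A):
    """Largest eigenvalue modulus of a 2x2 complex matrix (closed form)."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    s = cmath.sqrt(tr * tr - 4 * det)
    return max(abs((tr + s) / 2), abs((tr - s) / 2))

def max_singular_value(A):
    """Largest singular value of a 2x2 complex matrix."""
    (a, b), (c, d) = A
    fro2 = abs(a) ** 2 + abs(b) ** 2 + abs(c) ** 2 + abs(d) ** 2
    det2 = abs(a * d - b * c) ** 2
    # Eigenvalues of A*A have sum fro2 and product det2.
    return math.sqrt((fro2 + math.sqrt(max(fro2 * fro2 - 4 * det2, 0.0))) / 2)

def mu_grid(M, n=3600):
    """Grid estimate of max rho(XM) over X = diag(1, e^{i phi})."""
    best = 0.0
    for k in range(n):
        w = cmath.exp(2j * math.pi * k / n)
        XM = [[M[0][0], M[0][1]], [w * M[1][0], w * M[1][1]]]
        best = max(best, spectral_radius(XM))
    return best

M = [[1.0, 10.0], [0.1, 1j]]
rho, mu, sigma = spectral_radius(M), mu_grid(M), max_singular_value(M)
print(rho, mu, sigma)  # expected ordering: rho <= mu <= sigma
```

For this matrix the gap between the two extremes is large, which is why the intermediate diagonal structure is worth computing in its own right.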


2. The Simplest Non-Trivial Example


We  will  consider  the  general  (but  unrepeated-block)  structure  in  a  later  section.  The  anxious 
reader  can  skip  this  section,  but  it  is  probably  easier  to  read  this  section  first.  Here  we  see 
how the general approach works in the simplest non-trivial example: $\mu(M)$ associated with the
structure $\Delta^{1,1}$.

Let v denote a general complex 2-vector ($v \in \mathbb{C}^2$)

\[
v = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} .
\tag{4}
\]


The set $U_\Delta$ consists of 2x2 matrices X of the form:

\[
X = \begin{bmatrix} e^{i\theta_1} & 0 \\ 0 & e^{i\theta_2} \end{bmatrix} .
\tag{5}
\]


For each X in $U_\Delta$ let us examine the eigenvalue problem for the product matrix XM. If the
complex scalar $\lambda$ is an eigenvalue of XM there must be a nonzero v such that equation 6
holds:

\[
X M v = \lambda v .
\tag{6}
\]

To solve the maximization problem in equation 3 we need to determine the largest $|\lambda|$ that
can arise for any X in $U_\Delta$ and nonzero v satisfying equation 6.

In  the  past,  two  types  of  approach  have  been  applied  to  this  problem: 

Approach 1: Start at $X = \mathrm{Id}_2$ and solve the eigenvalue problem for XM = M. Next, for j =
1,2, consider $X_j$ obtained by incrementing $\theta_j$ by a small step-size $d\theta_j$. Solve the eigenvalue
problems for $X_j M$, compute $\max |\lambda|$ in both cases, and increment $\theta_1$, $\theta_2$ by taking a small
step in the direction of greatest first-order increase in $|\lambda|$. Iterate until a maximum is found.

Approach 2: Using the parametric representation of $U_\Delta$ given in equation 5, form the product
matrix XM symbolically as a function of $\theta_1$, $\theta_2$ and the entries of M. Write down the analytic
expression for the two eigenvalues of XM as functions of these same variables. Differentiate
with respect to $\theta_1$, $\theta_2$ the expressions for the magnitudes of the two eigenvalues at each point
and set these partial derivatives equal to zero to obtain a set of equations to be solved for
$\theta_1$, $\theta_2$ in order to determine the critical values of $|\lambda|$. Solve all these equations for $\theta_1$, $\theta_2$ and
substitute the solutions back into the eigenvalue formulas. The largest $|\lambda|$ arising from the
finite set of points considered should be the desired maximum.

These two approaches can be used but neither one provides a satisfactory solution to the general
problem. The problem with the first approach is the possibility of local maxima less than
the  true  maximum.  Examples  have  been  constructed  where  multiple  local  maxima  do  exist. 
Consequently,  gradient  search  techniques  do  not  have  guaranteed  convergence  properties. 
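A rough Python sketch of Approach 1 (standard library only) shows both the idea and its weakness. It uses a forward-difference estimate of the gradient and halves the step when an uphill move fails; the function names, step sizes, and stopping rule are assumptions chosen for illustration, not a prescription from the earlier work. The value it returns is only a local maximum of the spectral radius of XM, so it is a lower bound on $\mu(M)$ and nothing more.

```python
import cmath
import math

def spectral_radius(A):
    """Largest eigenvalue modulus of a 2x2 complex matrix (closed form)."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    s = cmath.sqrt(tr * tr - 4 * det)
    return max(abs((tr + s) / 2), abs((tr - s) / 2))

def rho_XM(M, theta):
    """rho(XM) for X = diag(e^{i theta_1}, e^{i theta_2})."""
    w1, w2 = cmath.exp(1j * theta[0]), cmath.exp(1j * theta[1])
    return spectral_radius([[w1 * M[0][0], w1 * M[0][1]],
                            [w2 * M[1][0], w2 * M[1][1]]])

def local_ascent(M, theta=(0.0, 0.0), d_theta=1e-6, step=0.5, max_iter=10000):
    """Finite-difference ascent on rho(XM); may stop at a local maximum."""
    theta = list(theta)
    value = rho_XM(M, theta)
    for _ in range(max_iter):
        if step < 1e-10:
            break
        # Forward-difference estimate of the gradient in (theta_1, theta_2).
        grad = []
        for j in range(2):
            shifted = list(theta)
            shifted[j] += d_theta
            grad.append((rho_XM(M, shifted) - value) / d_theta)
        norm = math.hypot(grad[0], grad[1])
        if norm == 0.0:
            break
        trial = [theta[j] + step * grad[j] / norm for j in range(2)]
        trial_value = rho_XM(M, trial)
        if trial_value > value:
            theta, value = trial, trial_value  # accept the uphill step
        else:
            step /= 2                          # overshoot: shrink the step
    return value, theta

M = [[1.0, 10.0], [0.1, 1j]]
value, theta = local_ascent(M)
print(value)  # a local maximum of rho(XM), not necessarily mu(M)
```

Starting from a different initial `theta` can lead to a different answer whenever several local maxima exist, which is exactly the difficulty described above.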

The second approach has the problem that it does not generalize to a tractable algorithm for
problems involving larger matrices.

The underlying weakness of both approaches is the reliance on analytic properties of the
eigenvalue problem depending on the X-parameter in equation 6. In our approach we solve
the maximization problem of equation 3 without making direct use of equation 6. We can
sidestep the eigenvalue problem because it gives more information than we really need. The
key question is how big we can make $|\lambda|$ before equation 6 has no solution for any X in $U_\Delta$.
It is possible to reformulate the problem in such a way that the vector v, the phase of $\lambda$,
and the matrix X do not play a direct role. In its new form the problem can be solved.

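For the structure $\Delta^{1,1}$ the reformulated problem is stated in terms of a pair of 2x2 Hermitian
matrices $H^1(r)$, $H^2(r)$ depending on a real parameter r. These two matrices are defined in
dyadic terms by:

\[
H^1(r) = \begin{bmatrix} \bar m_{11} \\ \bar m_{12} \end{bmatrix}
\begin{bmatrix} m_{11} & m_{12} \end{bmatrix}
- r \begin{bmatrix} 1 \\ 0 \end{bmatrix} \begin{bmatrix} 1 & 0 \end{bmatrix} ,
\qquad
H^2(r) = \begin{bmatrix} \bar m_{21} \\ \bar m_{22} \end{bmatrix}
\begin{bmatrix} m_{21} & m_{22} \end{bmatrix}
- r \begin{bmatrix} 0 \\ 1 \end{bmatrix} \begin{bmatrix} 0 & 1 \end{bmatrix}
\tag{7}
\]

where $m_{ij}$ are the components of the matrix M. The following lemma motivates the
construction of $H^1(r)$ and $H^2(r)$.

Lemma 1: There exist $X \in U_\Delta$, nonzero v, and complex $\lambda$ with $|\lambda|^2 = r$ satisfying equation
6 if and only if there is a nonzero vector $y \in \mathbb{C}^2$ such that

\[
y^* H^1(r)\, y = 0 \quad \text{and} \quad y^* H^2(r)\, y = 0 .
\tag{8}
\]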

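Proof of Lemma 1: First suppose a nonzero y satisfies equation 8. The components $y_1$, $y_2$ of y
satisfy:

\[
|m_{11} y_1 + m_{12} y_2|^2 = r\,|y_1|^2 , \qquad |m_{21} y_1 + m_{22} y_2|^2 = r\,|y_2|^2 .
\tag{9}
\]

Taking square roots of both sides of both equations and setting $\lambda = \sqrt{r}$, we find that equation 9
is equivalent to the existence of real numbers $\theta_1$, $\theta_2$ such that

\[
\begin{bmatrix} e^{i\theta_1} & 0 \\ 0 & e^{i\theta_2} \end{bmatrix}
\begin{bmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}
= \lambda \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} .
\tag{10}
\]

The expression in equation 10 is the component form of equation 6.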

Conversely, suppose equation 10 is satisfied. It is easy to see that the equations in 9 are
satisfied if r is set equal to $|\lambda|^2$. Then equation 8 is also satisfied. Lemma 1 is proved.
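Lemma 1 can be checked numerically. The Python sketch below (standard library only) picks an arbitrary X in $U_\Delta$, computes the eigenpairs of XM in closed form, sets r to the squared modulus of each eigenvalue, and evaluates the two Hermitian forms of equation 7 on the eigenvector. The function names, the phase value 0.7, and the test matrix (taken from the numerical example later in this section) are assumptions for illustration only.

```python
import cmath

def hermitian_forms(M, r):
    """H^1(r), H^2(r) of equation 7 as nested lists."""
    m11, m12, m21, m22 = (complex(x) for row in M for x in row)
    H1 = [[m11.conjugate() * m11 - r, m11.conjugate() * m12],
          [m12.conjugate() * m11, m12.conjugate() * m12]]
    H2 = [[m21.conjugate() * m21, m21.conjugate() * m22],
          [m22.conjugate() * m21, m22.conjugate() * m22 - r]]
    return H1, H2

def quad_form(H, y):
    """y* H y for a 2x2 matrix H and a 2-vector y."""
    return sum(y[a].conjugate() * H[a][b] * y[b]
               for a in range(2) for b in range(2))

def eigenpairs(A):
    """Eigenvalues and eigenvectors of a 2x2 matrix (closed form)."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    s = cmath.sqrt(tr * tr - 4 * det)
    pairs = []
    for lam in ((tr + s) / 2, (tr - s) / 2):
        v = [complex(b), lam - a]          # solves the first row of (A - lam) v = 0
        if abs(v[0]) + abs(v[1]) == 0:
            v = [lam - d, complex(c)]      # fall back to the second row
        if abs(v[0]) + abs(v[1]) == 0:
            v = [1 + 0j, 0j]               # A is a scalar matrix
        pairs.append((lam, v))
    return pairs

M = [[1.0, 10.0], [0.1, 1j]]
phi = 0.7  # arbitrary phase; X = diag(1, e^{i phi}) lies in U_Delta
w = cmath.exp(1j * phi)
XM = [[M[0][0], M[0][1]], [w * M[1][0], w * M[1][1]]]

residuals = []
for lam, v in eigenpairs(XM):
    H1, H2 = hermitian_forms(M, abs(lam) ** 2)  # r = |lambda|^2
    residuals.append((abs(quad_form(H1, v)), abs(quad_form(H2, v))))
print(residuals)  # both entries of each pair should be near zero
```

Note that the forms are built from M alone; the matrix X and the phase of the eigenvalue have dropped out, as promised.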

In view of lemma 1 we may restate the maximization problem of equation 3 as follows: determine
the largest real number r such that a nonzero vector y exists satisfying equation 8. This
reformulation leads immediately to

Question 1: Given a pair of Hermitian forms $H^1$ and $H^2$, when does there exist a nonzero
vector y such that $y^* H^1 y = 0$ and $y^* H^2 y = 0$ are satisfied simultaneously?

We will answer question 1 below in the statement of lemma 2. First we make two simple
observations.

Recall that the eigenvalues of a Hermitian matrix are real. We call a Hermitian matrix definite
if all its eigenvalues are nonzero and have the same sign (i.e. all are positive or all are
negative).

Observation 1: If either form $H^1$ or $H^2$ is definite then no solution vector y exists.

Returning to the definitions of the matrices $H^j(r)$ in equation 7, however, we see that both
matrices are indefinite by construction. Each has one positive and one negative eigenvalue if
no  row  of  M  is  zero.  So  observation  1  provides  no  useful  information,  but  it  does  lead 
directly  to 

Observation 2: If there is a real vector $t = [t_1, t_2]$ such that $H(t) = t_1 H^1 + t_2 H^2$ is definite
then no solution vector y exists.

This second observation will be useful. In particular, if r is chosen sufficiently large, we see
that $H^1(r) + H^2(r)$ will be negative definite. Thus observation 2 can be used to place an upper
bound on the size of $\mu(M)$. Better yet, it leads to

Lemma 2: If H(t) is indefinite for all real vectors t, then there exists a nonzero vector y such
that

\[
y^* H^1 y = 0 \quad \text{and} \quad y^* H^2 y = 0
\]

are satisfied simultaneously.

Proof of Lemma 2: It suffices to consider the one-parameter family of matrices

\[
F = \{\, t H^1 + H^2 \mid t \in \mathbb{R} \,\} .
\tag{11}
\]

A  Hermitian  matrix  has  real  eigenvalues;  its  determinant  is  the  product  of  those  eigenvalues. 
It  follows  that  a  2  x  2  Hermitian  matrix  is  definite  if  and  only  if  its  determinant  is  positive. 

Let us suppose that every matrix in F is indefinite. We conclude:

\[
\text{for all real } t : \quad \det(t H^1 + H^2) \le 0 .
\tag{12}
\]

There are now two cases.

Case 1: $\det(H^1) = 0$. Then the function of t on the left-hand side of inequality 12 is an affine,
non-positive function that therefore must be constant (i.e. independent of t). These conditions
can arise for a pair of Hermitian, 2x2 matrices only if $H^1$ is a scalar multiple of $H^2$, in
which case the conclusion of lemma 2 follows (take any nonzero y such that $y^* H^2 y = 0$).

Case 2: $\det(H^1)$ nonzero. Change the basis of $\mathbb{C}^2$ (if necessary) so that the matrices of $H^1$
and $H^2$ are:

\[
H^1 = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} , \qquad
H^2 = \begin{bmatrix} h_{11} & h_{12} \\ h_{21} & h_{22} \end{bmatrix}
\tag{13}
\]

for some real numbers $h_{11}$, $h_{22}$ and complex numbers $h_{12} = \bar h_{21}$.

In these coordinates the inequality 12 is equivalent to

\[
(h_{11} + h_{22})^2 - 4 h_{12} h_{21} \le 0 .
\tag{14}
\]

Also, any vector y satisfying $y^* H^1 y = 0$ has coordinates $y_1 = a e^{i\theta_1}$, $y_2 = a e^{i\theta_2}$ for some real
numbers $\theta_1$, $\theta_2$ and a. We want to find a nonzero vector of this form that also satisfies
$y^* H^2 y = 0$. We might as well set a = 1.

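The proof will be complete if we can find two real numbers $\theta_1$ and $\theta_2$ such that:

\[
0 = \begin{bmatrix} e^{-i\theta_1} & e^{-i\theta_2} \end{bmatrix}
\begin{bmatrix} h_{11} & h_{12} \\ h_{21} & h_{22} \end{bmatrix}
\begin{bmatrix} e^{i\theta_1} \\ e^{i\theta_2} \end{bmatrix}
= h_{11} + h_{22} + 2\,\mathrm{Re}\!\left( e^{i(\theta_2 - \theta_1)} h_{12} \right) .
\tag{15}
\]

As $\theta_2 - \theta_1$ varies, the last term sweeps the interval $[-2|h_{12}|,\, 2|h_{12}|]$, so equation (15)
can be solved if and only if $|h_{11} + h_{22}| \le 2|h_{12}|$, which is the inequality in 14.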


Lemma  2  is  proved. 
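The last step of the proof is constructive, and a short Python sketch (standard library only) makes that explicit. In the coordinates of equation 13 it returns a vector y with unit-modulus entries that annihilates both forms, or `None` when inequality 14 fails. The function name and the sample numbers are hypothetical, chosen only to exercise both branches.

```python
import cmath
import math

def common_null_vector(h11, h22, h12):
    """Solve equation 15 in the coordinates of equation 13.

    h11, h22 are real and h12 is complex (h21 is its conjugate).
    Returns y = (e^{i theta_1}, e^{i theta_2}) with theta_1 = 0, or None.
    """
    if (h11 + h22) ** 2 > 4 * abs(h12) ** 2:
        return None                      # inequality 14 fails: no solution
    if h12 == 0:
        return (1 + 0j, 1 + 0j)          # then h11 + h22 = 0, any phases work
    x = -(h11 + h22) / (2 * abs(h12))
    x = max(-1.0, min(1.0, x))           # guard acos against rounding
    phi = math.acos(x) - cmath.phase(h12)
    return (1 + 0j, cmath.exp(1j * phi))

h11, h22, h12 = 0.5, 0.3, 1 + 2j
y = common_null_vector(h11, h22, h12)
form1 = abs(y[0]) ** 2 - abs(y[1]) ** 2                       # y* H^1 y
form2 = h11 + h22 + 2 * (y[0].conjugate() * h12 * y[1]).real  # y* H^2 y
print(form1, form2)  # both should be near zero
```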


Remark  1:  As  a  consequence  of  Lemma  2  and  the  preceding  observations,  we  have  proved  the 
following: 

Let $H^1$, $H^2$ be a pair of indefinite 2x2 Hermitian matrices. Then one of two exclusive
alternatives holds:

Alternative 1: The matrix $t_1 H^1 + t_2 H^2$ is definite for some pair of real numbers $t_1$, $t_2$.
Alternative 2: There is a nonzero vector y such that $y^* H^i y = 0$ for i = 1,2.


Remark 2: The proof of Lemma 2 does not generalize to higher dimensions, nor does the
conclusion if n is bigger than 3. For higher dimensions we need to use a more sophisticated
approach  as  described  in  Section  3. 

Let us conclude the analysis of the $\Delta^{1,1}$ $\mu$-problem.

For the matrices $H^1(r)$ and $H^2(r)$ defined in equation 7, define the set F(r):

\[
F(r) = \{\, t H^1(r) + H^2(r) \mid t \in \mathbb{R} \,\} .
\tag{16}
\]

We have already observed that $H^1(r) + H^2(r)$ is definite for r sufficiently large. Define $R_0$ to
be the infimum of the set of R such that for all r > R there exists a pair of real numbers $t_1$, $t_2$
such that $t_1 H^1(r) + t_2 H^2(r)$ is definite. $R_0$ is finite and non-negative. By this construction,
any real combination of $H^1(R_0)$ and $H^2(R_0)$ must be indefinite, so alternative 2 of remark 1
above must hold. Furthermore, for any $r > R_0$ alternative 1 must hold (hence 2 cannot): it
follows that $R_0$ is the solution to the optimization problem, hence equal to $\mu(M)^2$ (recall that in
equation 10 we set $r = |\lambda|^2$ and $\mu(M)$ is the maximum $|\lambda|$).
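The definition of $R_0$ already suggests a crude numerical procedure: test Alternative 1 at a given r and bisect. The Python sketch below (standard library only) does this under stated assumptions. The test for a definite combination is our own restatement of the determinant criterion used in the proof of Lemma 2: the determinant of $t_1 H^1 + t_2 H^2$ is a real quadratic form in $(t_1, t_2)$, and a definite combination exists exactly when that form takes a positive value. The bracket is also an assumption: the lower end is the squared spectral radius of M (attained with X = Id, so Alternative 2 holds there), the upper end exceeds the squared maximum singular value (where $H^1 + H^2 = M^*M - r\,\mathrm{Id}$ is negative definite), and the sketch assumes the test switches only once in between. All function names are hypothetical.

```python
import cmath
import math

def hermitian_forms(M, r):
    """H^1(r), H^2(r) of equation 7 as nested lists."""
    m11, m12, m21, m22 = (complex(x) for row in M for x in row)
    H1 = [[m11.conjugate() * m11 - r, m11.conjugate() * m12],
          [m12.conjugate() * m11, m12.conjugate() * m12]]
    H2 = [[m21.conjugate() * m21, m21.conjugate() * m22],
          [m22.conjugate() * m21, m22.conjugate() * m22 - r]]
    return H1, H2

def det2(H):
    """Determinant of a 2x2 Hermitian matrix (real by construction)."""
    return (H[0][0] * H[1][1] - H[0][1] * H[1][0]).real

def has_definite_combination(H1, H2):
    """Alternative 1: is t1*H1 + t2*H2 definite for some real (t1, t2)?"""
    A, C = det2(H1), det2(H2)
    B = (H1[0][0] * H2[1][1] + H1[1][1] * H2[0][0]
         - 2 * H1[0][1] * H2[0][1].conjugate()).real
    # det(t1*H1 + t2*H2) = A*t1^2 + B*t1*t2 + C*t2^2 is positive somewhere
    # exactly when one of these three conditions holds.
    return A > 0 or C > 0 or B * B - 4 * A * C > 0

def spectral_radius(A):
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    s = cmath.sqrt(tr * tr - 4 * det)
    return max(abs((tr + s) / 2), abs((tr - s) / 2))

def r0_by_bisection(M, iterations=100):
    lo = spectral_radius(M) ** 2                           # Alternative 2 holds here
    hi = sum(abs(x) ** 2 for row in M for x in row) + 1.0  # Alternative 1 holds here
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if has_definite_combination(*hermitian_forms(M, mid)):
            hi = mid
        else:
            lo = mid
    return hi

M = [[1.0, 10.0], [0.1, 1j]]
R0 = r0_by_bisection(M)
print(R0, math.sqrt(R0))  # estimates of mu(M)^2 and mu(M)
```

Bisection only locates $R_0$ numerically; the discussion that follows shows how to characterize it algebraically instead.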

How do we compute $R_0$? First, we know from what we have already seen that there is a
nonzero vector y such that $y^* H^i(R_0)\, y = 0$ for i = 1,2. Consequently, there is a unitary matrix $U_0$
(not necessarily in $U_\Delta$) such that for all $t \in \mathbb{R}$:

\[
U_0^* \left( t H^1(R_0) + H^2(R_0) \right) U_0 =
\begin{bmatrix}
0 & t h^1_{12} + h^2_{12} \\
t h^1_{21} + h^2_{21} & t h^1_{22} + h^2_{22}
\end{bmatrix} .
\tag{17}
\]


Fortunately, we can say even more about the family $F(R_0)$.

Lemma 3: Let $H^1(r)$, $H^2(r)$, F(r) and $R_0$ be defined as above. For some real $t^0$ the matrix

\[
H^0 = t^0 H^1(R_0) + H^2(R_0)
\tag{18}
\]

is semi-definite, with at least one zero eigenvalue.

Proof of Lemma 3: For any $\epsilon > 0$ the family $F(R_0 + \epsilon)$ contains a definite matrix. There is no
difficulty in finding a compact set B of real numbers (B depends on $H^1(r)$ and $H^2(r)$ for r
between $R_0$ and $R_0 + 1$) such that for any $\epsilon$ between 0 and 1 there is a number $t^\epsilon$ in B for which
$t^\epsilon H^1(R_0 + \epsilon) + H^2(R_0 + \epsilon)$ is definite.

Take a sequence $\{\epsilon_i\}$ decreasing to 0, and select an accompanying sequence $\{t^{\epsilon_i}\}$ in B such
that the corresponding matrix is definite. Let $t^0$ be any limit point of the sequence $t^{\epsilon_i}$, and
define $H^0 = t^0 H^1(R_0) + H^2(R_0)$ as in equation 18. Every open set of 2 x 2 Hermitian
matrices containing $H^0$ also contains a definite matrix, yet $H^0$ is not definite. The eigenvalues
are continuous functions, so $H^0$ must be semi-definite; because it is not definite it has at least
one zero eigenvalue.

Lemma  3  is  proved. 

With lemma 3 in hand, let us reexamine the significance of equation 17. We now know that
when $t^0$ of lemma 3 is substituted for t in 17 the determinant vanishes. In fact, the function

\[
p(t) = \det\left( t H^1(R_0) + H^2(R_0) \right)
\tag{19}
\]

must have a zero of second order at $t = t^0$. This means that the derivative polynomial
$\frac{dp}{dt}(t)$ is also zero at $t^0$. Summarizing this last discussion, we make


Observation 3: At $r = R_0$ three things happen:

1) all forms in $F(R_0)$ are indefinite

2) for some $t^0$ the form $t^0 H^1(R_0) + H^2(R_0)$ is semidefinite

3) when the form is semidefinite, the determinant vanishes at $t^0$ to second order in t.


From these facts we see that $R_0$ can be computed by the following algorithm.

Step 1: Compute the coefficients of the polynomial function $p_1(t,r)$ of two variables defined by

\[
p_1(t,r) = \det\left( t H^1(r) + H^2(r) \right) .
\tag{20}
\]


Step 2: Compute the coefficients of

\[
p_2(t,r) = \frac{\partial p_1}{\partial t}(t,r) .
\tag{21}
\]

Step 3: Compute the coefficients of the polynomial function q(r) obtained by eliminating t
from $p_1$ and $p_2$.

Step  4:  Find  the  largest  positive  real  root  of  the  polynomial  q(r)  (if  no  positive  real  root  to 
q(r)  exists,  take  0  for  the  answer). 
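The four steps can be sketched in Python (standard library only) for a general 2x2 matrix. Two ingredients here are our own and should be read as assumptions. First, expanding the determinant in equation 20 gives $p_1(t,r) = -|m_{12}|^2 r\, t^2 + \left( r^2 - (|m_{11}|^2 + |m_{22}|^2)\, r + |\det M|^2 \right) t - |m_{21}|^2 r$; this closed form is not stated in the report, though it reproduces the numerical example later in this section. Second, since $p_1$ is at most quadratic in t, eliminating t between $p_1$ and $p_2$ is done by forming the discriminant $b^2 - 4ac$, and the roots of q(r) are found with a plain Durand-Kerner iteration, which is a stand-in for whatever root finder one prefers. All function names are hypothetical.

```python
import math

def p1_coefficients(M):
    """Step 1: p_1(t, r) = a(r) t^2 + b(r) t + c(r).

    Each of a, b, c is a list of coefficients in r, lowest degree first.
    """
    (m11, m12), (m21, m22) = M
    s = abs(m11) ** 2 + abs(m22) ** 2
    det_sq = abs(m11 * m22 - m12 * m21) ** 2
    a = [0.0, -abs(m12) ** 2]   # a(r) = -|m12|^2 r
    b = [det_sq, -s, 1.0]       # b(r) = r^2 - s r + |det M|^2
    c = [0.0, -abs(m21) ** 2]   # c(r) = -|m21|^2 r
    return a, b, c

def poly_mul(p, q):
    out = [0.0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out

def poly_sub(p, q):
    n = max(len(p), len(q))
    p = p + [0.0] * (n - len(p))
    q = q + [0.0] * (n - len(q))
    return [x - y for x, y in zip(p, q)]

def eliminate_t(a, b, c):
    """Steps 2-3: p_2 = 2 a t + b; eliminating t leaves q(r) = b^2 - 4 a c."""
    four_ac = [4 * x for x in poly_mul(a, c)]
    return poly_sub(poly_mul(b, b), four_ac)

def poly_roots(coeffs, iterations=500):
    """All complex roots by Durand-Kerner (coeffs lowest degree first)."""
    n = len(coeffs) - 1
    monic = [x / coeffs[-1] for x in coeffs]
    roots = [(0.4 + 0.9j) ** k for k in range(n)]
    for _ in range(iterations):
        updated = []
        for i, z in enumerate(roots):
            value = sum(cf * z ** k for k, cf in enumerate(monic))
            denom = 1.0
            for j, w in enumerate(roots):
                if j != i:
                    denom *= z - w
            updated.append(z - value / denom)
        roots = updated
    return roots

def mu_squared(M, tol=1e-6):
    a, b, c = p1_coefficients(M)
    q = eliminate_t(a, b, c)
    # Step 4: largest positive real root of q(r), or 0 if there is none.
    real_pos = [z.real for z in poly_roots(q) if abs(z.imag) < tol and z.real > 0]
    return max(real_pos, default=0.0)

M = [[1.0, 10.0], [0.1, 1j]]
R0 = mu_squared(M)
print(R0, math.sqrt(R0))  # estimates of mu(M)^2 and mu(M)
```

The discriminant shortcut is specific to the 2x2 case; for larger problems a genuine elimination procedure is needed.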

Claim: The answer obtained in the fourth step is $R_0$, the square of the desired value $\mu(M)$.
There are two cases to consider in verifying this last claim.

Case 1: $m_{11} = 0 = m_{12}$. Then the only nonzero entry in $H^1(r)$ is -r in the 1,1 spot and
$\det(H^1(r))$ is zero for all r. Consequently, the polynomial $p_1(t,r)$ is at most first-order in t so
$p_2(t,r)$ is independent of t, and $R_*$ satisfies $p_2(t,R_*) = 0$ if and only if $|m_{22}|^2 = R_*$. Then the
2,2 entries of $H^1(R_*)$ and $H^2(R_*)$ are both zero, hence alternative 2 of remark 1 is satisfied at r
= $R_*$. If $r > R_*$, the 2,2 entry of $H^2$ is negative and so, for large enough t, the determinant of
$t H^1 + H^2$ is positive. Thus alternative 1 holds for $r > R_*$, hence $R_* = R_0 = \mu(M)^2$.

Case 2: $m_{11}$ or $m_{12}$ nonzero. Then $\det(H^1(r))$ is negative for any r > 0 so $p_1(t,r)$ is quadratic
in t. If $R_*$ is any positive real root of the polynomial q(r) then $p_1(t,R_*)$ is of the form

\[
p_1(t,R_*) = -k^2 (t - u)^2
\]

where the leading coefficient is negative because $\det(H^1(R_*))$ is negative. It follows that at $R_*$
the determinant is non-positive for all t, hence $t H^1(R_*) + H^2(R_*)$ is indefinite for all t. Thus
$R_0$ cannot be less than $R_*$, the largest positive solution of q(r). On the other hand, $R_0$ is also
a root of q(r), hence $R_* = R_0 = \mu(M)^2$ as claimed.


Remark 3: For this low dimensional example the answer can be computed without going
through the formal procedure just outlined. When the size of the problem gets bigger, however,
the complexity of the computations becomes much greater and a more systematic
approach (e.g. elimination theory) is required. Though the details are different for the higher
dimensional problem discussed in the next section, the algorithm to compute $\mu(M)$ is basically
the same. A numerical example illustrates the four-step approach for the structure $\Delta^{1,1}$.

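Numerical Example: Let M be the 2 x 2 matrix

$$M = \begin{bmatrix} 1.0 & 10.0 \\ 0.1 & i \end{bmatrix}$$

The two matrices $H^1(r)$ and $H^2(r)$ are:

$$H^1(r) = \begin{bmatrix} (1.0 - r) & 10.0 \\ 10.0 & 100.0 \end{bmatrix} \tag{22}$$

$$H^2(r) = \begin{bmatrix} 0.01 & 0.1i \\ -0.1i & (1.0 - r) \end{bmatrix} \tag{23}$$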


Step 1 for example: The polynomial $p_1(t,r)$ is:

$$p_1(t,r) = \det\left(t\,H^1(r) + H^2(r)\right) = -100.0\,r\,t^2 + \left((1 - r)^2 + 1\right) t - 0.01\,r \tag{24}$$

Step 2 for example: The polynomial $p_2(t,r)$ is:

$$p_2(t,r) = \frac{\partial p_1}{\partial t}(t,r) = -200.0\,r\,t + (r^2 - 2.0\,r + 2.0) \tag{25}$$

Step 3 for example: Setting equation 25 equal to zero and solving for t:

$$t = \frac{r^2 - 2.0\,r + 2.0}{200.0\,r} \tag{26}$$

Obtain q(r) by substituting equation 26 into equation 24 and clearing fractions:

$$q(r) = (r^2 - 2.0\,r + 2.0)^2 - 4.0\,r^2 = (r^2 + 2.0)\,(r^2 - 4.0\,r + 2.0)$$

Step 4 for example: The largest root of q(r) is:

$$R_0 = 2.0 + \sqrt{2.0}$$

Finally

$$\mu(M) = \sqrt{R_0} = \sqrt{2.0 + \sqrt{2.0}} \approx 1.848$$
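
The value of the example can be cross-checked against the definition directly, without any elimination. The sketch below is a brute-force check, not the method of this report: it maximizes the spectral radius of XM over X = diag(1, e^{i theta}) on a uniform grid of phases (a common phase factor in X does not change the spectral radius, so one angle suffices for two blocks). The function names and the grid size are arbitrary choices made here.

```python
import cmath
import math

def spectral_radius_2x2(A):
    """Largest eigenvalue magnitude of a 2 x 2 complex matrix (quadratic formula)."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return max(abs((tr + root) / 2.0), abs((tr - root) / 2.0))

def mu_by_phase_sweep(M, samples=3600):
    """Maximize rho(X M) over X = diag(1, e^{i theta}) on a uniform grid of theta."""
    best = 0.0
    for s in range(samples):
        phase = cmath.exp(2j * math.pi * s / samples)
        XM = [M[0], [phase * M[1][0], phase * M[1][1]]]
        best = max(best, spectral_radius_2x2(XM))
    return best

M = [[1.0, 10.0], [0.1, 1j]]
mu_sweep = mu_by_phase_sweep(M)
mu_closed_form = math.sqrt(2.0 + math.sqrt(2.0))
print(mu_sweep, mu_closed_form)
```

A grid search like this only gives a lower bound in general; here the maximizing phase happens to lie on the grid, so the two printed numbers should agree closely.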

3. The Theory for Diagonal Matrices

In  this  section  we  analyze  the  case  of  diagonal,  unrepeated  blocks. 

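Let M be a complex n x n matrix and denote by $\Delta^{1,\dots,1}$ the set of n x n diagonal matrices:

$$\Delta^{1,\dots,1} = \left\{ \begin{bmatrix} \delta_1 & 0 & \cdots & 0 \\ 0 & \delta_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \delta_n \end{bmatrix} \right\} \tag{3.1}$$

where each $\delta_j$ is a complex number. Here we do not allow repetition of any of the $\delta_j$ symbols
on the diagonal.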

Recall that $U\Delta$ is the set of n x n unitary matrices in $\Delta^{1,\dots,1}$: i.e. for j = 1, ..., n, $|\delta_j| = 1$.
For each X in $U\Delta$ we consider the eigenvalue problem for the product matrix XM. If the complex
scalar $\lambda$ is an eigenvalue of XM there must be a nonzero v such that equation 3.2 holds:

$$XM\,v = \lambda\,v . \tag{3.2}$$

As in section 2, we need to determine the largest $|\lambda|$ that can arise for any X in $U\Delta$ and
nonzero v satisfying equation 3.2.

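We now reformulate the problem in terms of n Hermitian n x n matrices $H^1(r), \dots, H^n(r)$
depending on a real parameter r. These matrices are defined in dyadic terms by:

$$H^1(r) = \frac{1}{r}\begin{bmatrix} m_{11}^* \\ m_{12}^* \\ \vdots \\ m_{1n}^* \end{bmatrix}\begin{bmatrix} m_{11} & m_{12} & \cdots & m_{1n} \end{bmatrix} - \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}\begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}, \quad \dots, \quad H^n(r) = \frac{1}{r}\begin{bmatrix} m_{n1}^* \\ m_{n2}^* \\ \vdots \\ m_{nn}^* \end{bmatrix}\begin{bmatrix} m_{n1} & m_{n2} & \cdots & m_{nn} \end{bmatrix} - \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}\begin{bmatrix} 0 & \cdots & 0 & 1 \end{bmatrix} \tag{3.3}$$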


Lemma 3.1 Let $R_0$ be the largest real number r for which there exists a nonzero vector y satisfying:

$$y^* H^j(r)\, y = 0 \qquad j = 1, \dots, n. \tag{3.4}$$

Then $\mu(M) = \sqrt{R_0}$.

Proof  of  lemma  3.1:  This  simple  computation  is  analogous  to  the  one  for  the  special  case  (see 
Lemma  1  of  Section  2). 
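
The computation behind Lemma 3.1 can be illustrated numerically: if $XMv = \lambda v$ with X unitary diagonal and $r = |\lambda|^2$, then row j gives $|m_j v|^2 = r\,|v_j|^2$, which is $v^* H^j(r)\, v = 0$ in the normalization of equation 3.3. The sketch below checks this for the 2 x 2 matrix of the Section 2 example; the choice X = diag(1, -i) and the helper name are assumptions made for illustration.

```python
import cmath
import math

def quadratic_forms(M, y, r):
    """Return [y^* H^j(r) y for each j], with H^j(r) = (1/r) m_j^* m_j - e_j e_j^T."""
    values = []
    for j, row in enumerate(M):
        row_dot_y = sum(m * v for m, v in zip(row, y))      # the scalar m_j y
        values.append(abs(row_dot_y) ** 2 / r - abs(y[j]) ** 2)
    return values

# Illustrative data: M from the Section 2 example and X = diag(1, -i).
M = [[1.0, 10.0], [0.1, 1j]]
X = [1.0, -1j]
XM = [[X[j] * M[j][k] for k in range(2)] for j in range(2)]

# Eigenpair of the 2 x 2 matrix XM from the quadratic formula.
tr = XM[0][0] + XM[1][1]
det = XM[0][0] * XM[1][1] - XM[0][1] * XM[1][0]
lam = (tr + cmath.sqrt(tr * tr - 4.0 * det)) / 2.0
v = [XM[0][1], lam - XM[0][0]]     # solves the first row of (XM - lam I) v = 0

r = abs(lam) ** 2
forms = quadratic_forms(M, v, r)   # both entries should vanish up to rounding
```

Note that `quadratic_forms` uses the rows of M, not of XM: the unit-modulus factors in X drop out of $|m_j v|^2$.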

Remark 3.1 The matrices $H^j(r)$ defined here differ from those in Section 2 by a factor of $\frac{1}{r}$.
The idea is to normalize the $H^j$ matrices so that the negative eigenvalue is -1 for any r. The
disadvantage of this change is that the $H^j(r)$ matrices are no longer defined at r = 0. Because
we are primarily interested in those matrices for which $\mu(M)$ is greater than zero we accept
this disadvantage for the sake of normalization.

The matrices $H^j(r)$ are generically rank 2, but there are special conditions where the rank
drops to 0.

Definition 3.1 We will call the matrix M degenerate if some row is zero off the main diagonal
(i.e. for some i and all j not equal to i, $m_{ij} = 0$).

It  is  convenient  to  exclude  the  degenerate  matrices  from  the  general  analysis  below  so  we  take 
care  of  them  now. 

Lemma 3.2: Suppose the matrix M is degenerate, and that row i is zero away from the diagonal.
Then $\mu(M)$ is the larger of the two numbers:

1) $|m_{ii}|$ or

2) $\mu(M_i)$

where $M_i$ is the (n - 1) x (n - 1) matrix obtained by deleting the i-th row and column of M.

Proof of Lemma 3.2: Let M, v and $\lambda$ satisfy equation 3.2. Two possibilities arise as
follows:

1) if the vector v has $v_i$ nonzero then $|\lambda| = |m_{ii}|$,

2) if the vector v has $v_i = 0$ then the (n-1)-vector obtained from v by deleting $v_i$ satisfies 3.2 for
the matrix $M_i$ and the same $\lambda$.

We know that $\mu(M)$ is at least as large as $|m_{ii}|$ because $m_{ii}$ is an eigenvalue of M and
$\mu(M) \ge \rho(M)$. The only way that $\mu(M)$ could be larger is if some v with $v_i = 0$ satisfies
equation 3.2 with some $\lambda$ of magnitude larger than $|m_{ii}|$. In that case $\mu(M)$ is equal to $\mu(M_i)$.

Lemma 3.2 is proved.

If the matrix $M_i$ obtained from M is still degenerate, lemma 3.2 can be applied repeatedly.
Repeated application will eventually lead to the computation of $\mu$ for a non-degenerate matrix
or, if M is diagonal, eliminate the $\mu$ computation completely (for diagonal M, $\mu(M) = \rho(M)$).
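
The repeated application of Lemma 3.2 can be sketched as a small reduction loop. This is a minimal illustration with a hypothetical function name; it uses an exact test for zero entries, which a practical implementation working in floating point would presumably replace by a tolerance.

```python
def strip_degenerate_rows(M):
    """Apply Lemma 3.2 repeatedly.

    Returns (bound, reduced): `bound` is the largest |m_ii| over the removed
    rows (0.0 if none were removed) and `reduced` is the remaining
    non-degenerate matrix (possibly empty).
    Then mu(M) = max(bound, mu(reduced)).
    """
    M = [list(row) for row in M]
    bound = 0.0
    while M:
        n = len(M)
        # Find a row that is zero away from the diagonal (Definition 3.1).
        degenerate = next(
            (i for i in range(n) if all(M[i][j] == 0 for j in range(n) if j != i)),
            None,
        )
        if degenerate is None:
            break
        i = degenerate
        bound = max(bound, abs(M[i][i]))
        # Delete row i and column i to form M_i.
        M = [[M[a][b] for b in range(n) if b != i] for a in range(n) if a != i]
    return bound, M

# Row 0 is zero off the diagonal; the remaining 2 x 2 block is non-degenerate.
bound, reduced = strip_degenerate_rows([[2, 0, 0], [1, 3, 4], [5, 6, 7]])

# A diagonal matrix is consumed entirely, leaving mu(M) = max |m_ii|.
diag_bound, diag_reduced = strip_degenerate_rows([[1, 0], [0, -3j]])
```

In the first call the non-degenerate remainder still needs the general analysis that follows; in the second call nothing remains to be computed.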

For  the  rest  of  this  section  we  assume  M  is  non-degenerate. 

All the matrices $H^j(r)$ are rank 2. Let us assume there is a nonzero vector y satisfying
equation 3.4. Then there is a unitary transformation U such that the 1,1 entries of the matrices
$U^* H^j(r)\, U$ are 0. In theorem 1 below we characterize the structure of the matrices $U^* H^j(r)\, U$.

Theorem 1 (Canonical Form): Suppose there is a nonzero vector y satisfying equation 3.4.
Then there is a unitary matrix U such that for each j the matrix $U^* H^j(r)\, U$ is in the following form:

$$U^* H^j(r)\, U = \begin{bmatrix} 0 & k^{j*}\,(\alpha^{j*} - \beta^{j*}) \\ k^j\,(\alpha^j - \beta^j) & \alpha^j\alpha^{j*} - \beta^j\beta^{j*} \end{bmatrix} \tag{3.5}$$

The n complex numbers $k^j$ and the n (n-1)-vectors $\alpha^j$ and $\beta^j$ in 3.5 are related to U and M by the
equations:

$$U = \begin{bmatrix} k^1 & \beta^{1*} \\ \vdots & \vdots \\ k^n & \beta^{n*} \end{bmatrix}, \qquad \frac{e^{i\Theta} M U}{\sqrt{r}} = \begin{bmatrix} k^1 & \alpha^{1*} \\ \vdots & \vdots \\ k^n & \alpha^{n*} \end{bmatrix} \tag{3.6}$$

for some real diagonal n x n matrix $\Theta$.

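Proof of Theorem 1: Constructive. By assumption there is a unit length vector k such that

$$\frac{e^{i\Theta} M}{\sqrt{r}} \begin{bmatrix} k^1 \\ \vdots \\ k^n \end{bmatrix} = \begin{bmatrix} k^1 \\ \vdots \\ k^n \end{bmatrix} \tag{3.7}$$

The unit vector k can be expanded (nonuniquely) by n complex vectors $\beta^j$ of size (n-1) to an
n x n unitary matrix U:

$$U = \begin{bmatrix} k^1 & \beta^{1*} \\ \vdots & \vdots \\ k^n & \beta^{n*} \end{bmatrix} \tag{3.8}$$

The vectors $\alpha^j$ of size n-1 are then given by the following equation:

$$\frac{e^{i\Theta} M U}{\sqrt{r}} = \begin{bmatrix} k^1 & \alpha^{1*} \\ \vdots & \vdots \\ k^n & \alpha^{n*} \end{bmatrix} \tag{3.9}$$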


Equations 3.8 and 3.9 are the identities in equation 3.6. From here it is an easy task to verify
3.5: the j-th row of the right hand side of 3.9 can be multiplied on the left by its conjugate
transpose to get

$$\begin{bmatrix} |k^j|^2 & k^{j*}\alpha^{j*} \\ k^j\alpha^j & \alpha^j\alpha^{j*} \end{bmatrix}$$

which is one part of 3.5 (the matrix $e^{i\Theta}$ drops out). Similarly the j-th row of the right hand
side of 3.8 can be left-multiplied by its conjugate transpose to get:

$$\begin{bmatrix} |k^j|^2 & k^{j*}\beta^{j*} \\ k^j\beta^j & \beta^j\beta^{j*} \end{bmatrix}$$

When these last two matrices are subtracted the result is the right hand side of equation 3.5.
Theorem 1 is proved.


Remark 3.2 Except for special cases of M the vector k is unique up to phase. The vector k
can be identified with the vector y that was assumed to exist in the hypotheses. There was
some freedom in the selection of the parameters $\beta^j$: all other solutions are obtained by
multiplying U on the right by a general matrix of the form:

$$T_n = \begin{bmatrix} 1 & 0 \\ 0 & T_{n-1} \end{bmatrix} \tag{3.10}$$

where $T_{n-1}$ is a unitary matrix of size (n-1) x (n-1). The $\alpha^j$ vectors are uniquely defined once
the other parameters are fixed.


The representation of the $H^j(r)$ matrices given in Theorem 1 applies to any value $r = |\lambda|^2$ for
which the $\mu$-eigenvalue equation 3.2 has a solution. We are interested in characterizing
extremal solutions of 3.2. More precisely, we want to determine those values $R_k$ that are local
maxima of those values of r for which equation 3.2 can be solved.

Before  embarking  on  our  extremal  set  analysis  there  are  two  topics  to  discuss.  First,  a  sum¬ 
mary  of  our  goals  for  the  rest  of  this  section. 

We are headed for a theorem that characterizes the solution to the structured singular value
problem for a generic set of matrices. What we mean by generic is the second topic in this
digression; for now we concentrate on describing the general approach. The first step is to
identify a real-algebraic set S contained in $C^n$ that contains the extremal point we seek. The
set S has a simple definition but a complicated structure. At a generic point S has a
neighborhood diffeomorphic to $R^n$, but there are exceptional points comprising the singular set of S
where this is not true. It is in the singular set of S that our answer lies.

Our final answer will be a set of polynomial equations that can be solved to determine every
point in the singular set of S. This set of polynomial equations is derived from a set of
singularity relations on the Jacobian of the defining equations of S that must hold at a point $y_0$ if it
is in the exceptional set. The canonical form in Theorem 1 is used to construct the polynomial
equations for the singular set from the singularity relations. Elimination theory is used to
generate a polynomial in the single variable r. The largest real solution of this polynomial is
(generically) the square of $\mu(M)$.

As mentioned at the beginning of this digression, the argument below will apply only to
generic matrices M. For our work here, an argument will be said to apply generically if it is true
for an open (standard Euclidean norm topology), dense subset of the space of complex n x n
matrices. The precise conditions for which arguments apply will be stated in each case. The
results presented here were derived only recently and we did not have time to look for more
general proofs. Presumably, the special cases will be worked out later and a more
comprehensive treatment might be found.

Returning to the extremal problem we introduce more notation.
Let $S_r$ be the following set:

$$S_r = \{\, y \in C^n \mid y^* y = 1,\; y^* H^j(r)\, y = 0,\; j = 1, \dots, n \,\} . \tag{3.11}$$

We want to find the largest value of r such that $S_r$ is nonempty. Define the set S to be the
union of $S_r$ over all r > 0. Suppose we have a curve y(r) mapping r in some interval $[r_0, r_1]$
into S such that $y(r) \in S_r$. We suppose that the mapping is differentiable at all points where it
is defined. What can be said about a value $R_k$ of r beyond which this curve cannot be
continued?


At all points r for which the curve y(r) is defined differentiably, the tangent vector $\frac{dy}{dr}$ must
satisfy:

$$\frac{dy^*}{dr}\, y + y^* \frac{dy}{dr} = 0, \qquad \frac{dy^*}{dr} H^j(r)\, y + y^* H^j(r) \frac{dy}{dr} + y^* \frac{dH^j(r)}{dr}\, y = 0, \quad j = 1, \dots, n. \tag{3.12}$$

Altogether, this set of equations imposes n+1 real-linear conditions on the tangent vector $\frac{dy}{dr}$
that lives in a 2n-dimensional real vector space. The only way that there could be a problem
in finding a solution for a given value y(r) is if the equations are overdetermined and
inconsistent. Theorem 2 characterizes the local maximum points $R_k$.

Theorem 2:

1) Suppose $y_0$ is a point in $S_{r_0}$. If the space of vectors $w \in C^n$ satisfying

$$w^* y_0 + y_0^* w = 0, \qquad w^* H^j(r_0)\, y_0 + y_0^* H^j(r_0)\, w + y_0^* \frac{dH^j(r_0)}{dr}\, y_0 = 0, \quad j = 1, \dots, n \tag{3.13}$$

is exactly (n-1)-dimensional then $r_0$ is not a local maximum point.

2) The local maximum points $R_k$ are contained in the singular set $S_{sing}$ defined as follows:
$S_{sing}$ is the set of those values of r for which there exists a set of real numbers $(t_1, \dots, t_n)$,
not all zero, and $y_k \in S_r$ such that:

$$\sum_{j=1}^{n} t_j H^j(r)\, y_k = 0 \quad \text{and } t_j k^j \text{ nonzero for some } j. \tag{3.14}$$


Proof of Theorem 2:

1) The existence of an (n-1)-dimensional space of solutions w of equation 3.13 at $y_0$ implies a
local manifold structure for S (locally n-dimensional) near $y_0$. A curve through $y_0$ can be
found for values of r in the interval $[r_0, r_0 + \eta]$ for some small, positive $\eta$ that stays in S. It
follows that $r_0$ is not a local maximum point.

2) The alternative to case 1 can arise only for values $r = R_k$ for which there is a point $y_0$ in
$C^n$ where the Jacobian of the mapping from $C^n$ to $R^{n+1}$ defining $S_{R_k}$ is not full rank. Let us
suppose that there is a linear relation among the n+1 gradient functions. We use the canonical
form coordinates provided by Theorem 1 for a general point in S. In these coordinates, the
point $y_0$ and the general tangent vector w are:

$$y_0 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \qquad w = \begin{bmatrix} a_1 + i b_1 \\ a_2 + i b_2 \\ \vdots \\ a_n + i b_n \end{bmatrix} \tag{3.15}$$

The first relation $w^* y_0 + y_0^* w = 0$ is, in this notation:

$$2a_1 = (a_1 - i b_1) + (a_1 + i b_1) = 0 \tag{3.16}$$

or, in other words, $w_1$ is pure imaginary. This condition is the infinitesimal form of the
requirement that $\|y(r)\|$ have a constant value: the curve stays in $S^{2n-1}$, the unit (2n-1)-sphere in
$C^n$. The space S is contained in $S^{2n-1} \times R$ where the last R factor parametrizes the r-values.

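There are n more linear relations imposed on w: for each j we have

$$\begin{bmatrix} a_1 - i b_1 & a_2 - i b_2 & \cdots & a_n - i b_n \end{bmatrix} \begin{bmatrix} 0 \\ k^j(\alpha^j - \beta^j) \end{bmatrix} + \begin{bmatrix} 0 & k^{j*}(\alpha^{j*} - \beta^{j*}) \end{bmatrix} \begin{bmatrix} a_1 + i b_1 \\ a_2 + i b_2 \\ \vdots \\ a_n + i b_n \end{bmatrix} + \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix} U^* \frac{dH^j(R_k)}{dr} U \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = 0 . \tag{3.17}$$

The expression $\frac{dH^j(R_k)}{dr}$ is computed in the original coordinates of equation 3.2. We have
chosen our coordinates so that the derivatives of U and $U^*$ do not appear: the matrices U are
held fixed (at their values for the canonical form at $R_k$) as the parameter r is varied
continuously about $R_k$. For now we treat the last summand on the left hand side of 3.17 as a general
(indeterminate) vector. To simplify notation define:

$$Z^j = -\begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix} \frac{d\, U^* H^j(R_k)\, U}{dr} \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \tag{3.18}$$

Then (3.17) simplifies to:

$$2\, \mathrm{Re} \left( \begin{bmatrix} k^{1*}(\alpha^{1*} - \beta^{1*}) \\ \vdots \\ k^{n*}(\alpha^{n*} - \beta^{n*}) \end{bmatrix} \begin{bmatrix} a_2 + i b_2 \\ \vdots \\ a_n + i b_n \end{bmatrix} \right) = \begin{bmatrix} Z^1 \\ \vdots \\ Z^n \end{bmatrix} \tag{3.19}$$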

If the rows of the complex coefficient matrix $[k^{j*}(\alpha^{j*} - \beta^{j*})]$ are linearly independent over the
real numbers, then for any Z there exists a set of real numbers $a_j$ and $b_j$ solving equation 3.19.
Conversely, if the rows are linearly dependent over the reals then equation 3.19 can be solved
only if Z lies in the kernel of the real row vector that annihilates the complex coefficient
matrix. The local manifold structure of S is guaranteed only at those points where the
rows are independent over the reals. This leads us to define:

Singularity Conditions: let r be in S and suppose $y_0$ is in $S_r$. Choose a transformation U so
that $U^* H^j(r)\, U$ is in the canonical form of Theorem 1. Let Z be the real vector in $C^n$ given
by 3.18. If r is in $S_{sing}$ then there must be some set $\{t_1, \dots, t_n\}$ of real numbers, not all zero,
such that:

$$\begin{bmatrix} t_1 & \cdots & t_n \end{bmatrix} \begin{bmatrix} k^{1*}(\alpha^{1*} - \beta^{1*}) \\ \vdots \\ k^{n*}(\alpha^{n*} - \beta^{n*}) \end{bmatrix} = 0, \qquad t_j k^j \text{ nonzero for some } j. \tag{3.20}$$

Equation 3.14 is an equivalent form of the singularity conditions in the original coordinates
(without the $U^*$, U terms, and conjugate-transposed).

Theorem  2  is  proved. 


Remark 3.3: The conditions imposed in 3.16 are independent of those imposed in 3.19.
Geometrically this makes sense -- the relations in 3.19 are the infinitesimal form of the
relations $y^* H^j(r)\, y = 0$ that are invariant under multiplication by a complex scalar and so define a
relation on the projective space of lines in $C^n$. Equation 3.16, on the other hand, is the
infinitesimal form of the normalization of vectors by their length. Any (nonzero)
projective-space solution of the equations $y^* H^j(r)\, y = 0$ can be normalized to have unit length, so it is
consistent that the infinitesimal constraints should be independent.


Remark 3.4: As mentioned in Remark 3.2 the canonical form of Theorem 1 is not unique.
Because that form is used in the definition of $S_{sing}$, we should verify that the set $S_{sing}$ is well
defined, independent of choices. A different choice of form would result in multiplying the
complex coefficient matrix $[k^{j*}(\alpha^{j*} - \beta^{j*})]$ on the right by a unitary matrix $T_{n-1}$ as in equation
3.10 (actually, the phase of the matrix $T_{n-1}$ may be shifted by a uniform factor due to
nonuniqueness of the vector k). Transformations of this type cannot change the singularity
status of this matrix: the singularity test depends on the existence of a real vector that
annihilates the complex coefficient matrix when applied on the left, while the nonuniqueness
is represented by multiplication with a nonsingular matrix on the right. Thus $S_{sing}$ is well
defined, independent of choices made in the construction.


The space $S_{sing}$ has a direct analytic tie to the original $\mu$-problem. Consider a value r for
which $S_r$ is nonempty. From the equations 3.6 it is easy to verify that:

$$\left( \frac{e^{i\Theta} M}{\sqrt{r}} - I \right) U = \begin{bmatrix} 0 & \alpha^{1*} - \beta^{1*} \\ \vdots & \vdots \\ 0 & \alpha^{n*} - \beta^{n*} \end{bmatrix} \tag{3.21}$$

It is clear from this expression that $\sqrt{r}$ is a valid eigenvalue for the equation 3.2 because the
rank of

$$\frac{e^{i\Theta} M}{\sqrt{r}} - I$$

is less than n. For such a matrix there is always a vector $w^*$ such that

$$w^* \left( \frac{e^{i\Theta} M}{\sqrt{r}} - I \right) = 0 . \tag{3.22}$$

The vector $w^*$ is the left eigenvector for the problem 3.2. If r is in $S_{sing}$, however, then by
3.20 and 3.21 there is a set of real numbers $\{t_1, \dots, t_n\}$ such that $t_j k^j$ is nonzero for some
j and

$$\begin{bmatrix} t_1 k^{1*} & \cdots & t_n k^{n*} \end{bmatrix} \left( \frac{e^{i\Theta} M}{\sqrt{r}} - I \right) U = 0 . \tag{3.23}$$


Recall that k is the right eigenvector for M from Theorem 1. An alternate characterization of
$S_{sing}$ is as follows:

Alternate Singularity Characterization: The point r is in $S_{sing}$ if there is a real n x n matrix $\Theta$
such that the right eigenvector k and left eigenvector $w^*$ for $e^{i\Theta}M$ have components with
conjugate phase, i.e. $w_j^* k_j$ is real for j = 1, ..., n.
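
The alternate characterization can be checked on the 2 x 2 example of Section 2. Because eigenvectors are only defined up to a complex scale factor, the sketch below tests a scale-invariant version of the condition, namely that the ratio $(w_2^* k_2)/(w_1^* k_1)$ is real. That reading of "conjugate phase", the function name, and the parametrization $e^{i\Theta} = \mathrm{diag}(1, e^{i\theta})$ are assumptions made for this illustration; the closed-form eigenvectors require nonzero off-diagonal entries.

```python
import cmath
import math

def phase_alignment(M, theta):
    """For A = diag(1, e^{i theta}) M, return (w_2^* k_2) / (w_1^* k_1) at the
    dominant eigenvalue, with k a right and w^* a left (row) eigenvector of A."""
    phase = cmath.exp(1j * theta)
    a, b = M[0][0], M[0][1]
    c, d = phase * M[1][0], phase * M[1][1]
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4.0 * det)
    lam = max((tr + root) / 2.0, (tr - root) / 2.0, key=abs)
    k = [b, lam - a]        # right eigenvector: A k = lam k
    w = [c, lam - a]        # left eigenvector as a row: w A = lam w
    return (w[1] * k[1]) / (w[0] * k[0])

M = [[1.0, 10.0], [0.1, 1j]]
at_optimum = phase_alignment(M, -math.pi / 2)   # phase where rho(XM) reaches mu(M)
elsewhere = phase_alignment(M, 0.0)             # a non-extremal phase, for contrast
```

At the extremal phase the ratio comes out real (up to rounding), while at the other phase it has a clearly nonzero imaginary part, which is consistent with the characterization.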

So far we have worked at reformulating the problem but we have not made much progress
toward solving it. Before proceeding to the solution we pause to introduce a generic
condition on the class of M.

GENERIC CONDITION: We assume that for all real r the rank of

$$\frac{e^{i\Theta} M}{\sqrt{r}} - I \tag{3.24}$$

for any real matrix $\Theta$ is never less than n-1.

Remark 3.5: Another way of stating the generic condition is to require that the matrix $e^{i\Theta}M$ is
free of repeated eigenvalues for any real $\Theta$. Experience with numerical aspects of the standard
eigenvalue problem suggests that this restriction on the problem might simplify analysis. The
possibility of nontrivial Jordan cells is eliminated if repeated eigenvalues are not allowed. The
non-repeated root condition holds for an open, dense subset of M and so is a generic
condition. It will be assumed to hold for M in the analysis that follows.


It is worthwhile to see what the generic condition implies about the canonical form.

Observation 3.1 The generic condition is exactly the condition that the matrix:

$$\left( \frac{e^{i\Theta} M}{\sqrt{r}} - I \right) U = \begin{bmatrix} 0 & \alpha^{1*} - \beta^{1*} \\ \vdots & \vdots \\ 0 & \alpha^{n*} - \beta^{n*} \end{bmatrix}$$

(see equation 3.21) is rank n-1. Equivalently, the n x (n-1) matrix $[\alpha^{j*} - \beta^{j*}]$ is full rank. The
direction of the vector $[t_1 k^{1*}, \dots, t_n k^{n*}]$ that appears in equation 3.23 as a left eigenvector
is therefore uniquely determined (up to a complex scalar factor) by the matrix $[\alpha^{j*} - \beta^{j*}]$.


Theorem 3: Let M be a generic matrix, and consider the polynomial function in the real
vector $t = \{t_1, \dots, t_n\}$ and $\frac{1}{r}$:

$$p(t_1, \dots, t_n, \tfrac{1}{r}) = \det\left(t_1 H^1(r) + \cdots + t_n H^n(r)\right) \tag{3.25}$$

If $r_0$ is in $S_{sing}$ then the function p is a polynomial that is not identically zero. Furthermore,
there exists some point $\{t_1^0, \dots, t_n^0\}$ such that

$$p(t_1^0, \dots, t_n^0, \tfrac{1}{r_0}) = 0, \tag{3.26a}$$

$$\frac{\partial p}{\partial t_j}(t_1^0, \dots, t_n^0, \tfrac{1}{r_0}) = 0 \qquad j = 1, \dots, n . \tag{3.26b}$$

Furthermore, from 3.26a and 3.26b there can be constructed a polynomial $g(\frac{1}{r})$ such that
every real number $r_0$ in $S_{sing}$ is a solution of

$$g(\tfrac{1}{r_0}) = 0 \tag{3.27}$$

Conversely, if $r_0$ is a real root of the polynomial g and if equations 3.26a and 3.26b are
satisfied nonvacuously by some real set of values $(t_1^0, \dots, t_n^0)$ then $r_0$ is in S.

Finally, if $S_{sing}$ is nonempty, $\mu(M) = \sqrt{R_{max}}$ where $R_{max}$ is the largest real number in $S_{sing}$. If
$S_{sing}$ is empty, $\mu(M) = 0$.

Proof of Theorem 3:

First we want to show that if $r_0$ is in $S_{sing}$ there is a nonzero real vector $t^0$ such that p and
$\frac{\partial p}{\partial t_j}$ vanish at $(t^0, \frac{1}{r_0})$. By Theorem 1 there is a unitary U such that

$$U^* H^j(r_0)\, U = \begin{bmatrix} 0 & k^{j*}(\alpha^{j*} - \beta^{j*}) \\ k^j(\alpha^j - \beta^j) & \alpha^j\alpha^{j*} - \beta^j\beta^{j*} \end{bmatrix} \tag{3.28}$$

By Theorem 2 there is a real nonzero vector $t^0$ such that

$$\begin{bmatrix} t_1^0 & \cdots & t_n^0 \end{bmatrix} \begin{bmatrix} k^{1*}(\alpha^{1*} - \beta^{1*}) \\ \vdots \\ k^{n*}(\alpha^{n*} - \beta^{n*}) \end{bmatrix} = 0 . \tag{3.29}$$

Fixing the parameter $r_0$, define H(t) by

$$H(t) = t_1 H^1(r_0) + \cdots + t_n H^n(r_0) \tag{3.30}$$

Then

$$U^* H(t)\, U = \begin{bmatrix} 0 & \sum_j t_j\, k^{j*}(\alpha^{j*} - \beta^{j*}) \\ \sum_j t_j\, k^j(\alpha^j - \beta^j) & \sum_j t_j\, (\alpha^j\alpha^{j*} - \beta^j\beta^{j*}) \end{bmatrix} \tag{3.31}$$

The 1,1 entry in 3.31 vanishes identically, and the first row and column vanish at $t = t^0$.
Expanding the determinant about the first row it is clear that the polynomial

$$p(t, \tfrac{1}{r_0}) = \det(H(t)) = \det(U^* H(t)\, U) \tag{3.32}$$

vanishes to first order at $t = t^0$ (saying that p vanishes to first order at $t^0$ is another way of
saying that $p(t^0) = 0$ and $\frac{\partial p}{\partial t_j}(t^0) = 0$). It is not difficult to show (using the generic
assumption on M and equation 3.31) that the unique (up to scale) $t^0$ satisfies the polynomial
equations 3.26a and 3.26b nonvacuously (see below). We have verified that each $r_0$ in $S_{sing}$ gives
rise to a nontrivial solution of equations 3.26a and 3.26b.


We  now  want  to  show  the  converse.  Suppose  r0  is  a  value  of  r  such  that  the  real  vector  t° 
satisfies  equations  3.26a  and  3.26b  nonvacuously  (nonvacuously  means  that  at  least  one 
nonzero  t°-monomial  is  multiplied  by  a  nonzero  coefficient  of  p). 

Compute the (n-1) x (n-1) minor-polynomials of H(t) with respect to some row -- say the first
row. Let $f_i(t)$ denote the i-th minor polynomial. Define the vector polynomial function F(t):

$$F(t) = \begin{bmatrix} f_1(t) \\ \vdots \\ f_n(t) \end{bmatrix} \tag{3.33}$$

From $\det(H(t^0)) = 0$ it follows that

$$H(t^0)\, F(t^0) = \begin{bmatrix} \det(H(t^0)) \\ 0 \\ \vdots \\ 0 \end{bmatrix} = 0 \tag{3.34}$$

From the condition that the determinant of H(t) vanishes at $t^0$ to first order, we know that

$$\frac{\partial}{\partial t_j}\left[ F^*(t)\, H(t)\, F(t) \right] \Big|_{t = t^0} = \frac{\partial F^*}{\partial t_j}(t^0)\, H(t^0)\, F(t^0) + F^*(t^0)\, \frac{\partial H}{\partial t_j}(t^0)\, F(t^0) + F^*(t^0)\, H(t^0)\, \frac{\partial F}{\partial t_j}(t^0) = 0 . \tag{3.35}$$

By equation 3.34 and its conjugate transpose, the first and third terms of the sum in 3.35
vanish. Evaluating the middle term, we find:

$$F^*(t^0)\, H^j\, F(t^0) = 0 . \tag{3.36}$$

which is exactly the form of the homogeneous system of equations defining S. The argument
is not complete, however, because all the first order minors could vanish at $t^0$ -- then $F(t^0)$ is
the zero vector and equation 3.36 reduces to a vacuous assertion.

In case all the minors of H(t) vanish at $t^0$, take repeated partial derivatives of the function
F(t) with respect to an appropriate set of $t_j$, creating a sequence of vector polynomials
$F_0(t) = F(t), F_1(t), F_2(t), \dots$ that all vanish to first order at $t^0$ until, finally, $F_n(t)$ vanishes at
$t^0$ but no longer to first order. This process works because of the assumption that $t^0$ satisfies
the equations 3.26a and 3.26b (it would not be true otherwise). Suppose $\frac{\partial F_n}{\partial t_j}(t^0)$ is nonzero.
Then

$$0 = \frac{\partial H}{\partial t_j}(t^0)\, F_n(t^0) + H(t^0)\, \frac{\partial F_n}{\partial t_j}(t^0) \tag{3.37}$$

so $H(t)\frac{\partial F_n}{\partial t_j}(t)$ vanishes at $t^0$ and $\frac{\partial F_n}{\partial t_j}(t^0)$ is nonzero.

Repeat the computations of equations 3.35 and 3.36 with $\frac{\partial F_n}{\partial t_j}$ in place of F.

We conclude that $r_0$ is in S.


To complete the proof of Theorem 3 we explain how the polynomial g is constructed. The
system of equations 3.26a and 3.26b is a set of n+1 simultaneous equations in the n+1
unknown variables $t_1, \dots, t_n, \frac{1}{r}$. In fact, these polynomials are not independent because of
the Euler identity:

$$n\, p(t_1^0, \dots, t_n^0, \tfrac{1}{r_0}) = \sum_{j=1}^{n} t_j^0\, \frac{\partial p}{\partial t_j}(t_1^0, \dots, t_n^0, \tfrac{1}{r_0}) \tag{3.34}$$

Consequently, any solution of equations 3.26b will automatically satisfy equation 3.26a, so we
really have only n equations to solve. The polynomials are homogeneous in the t-variable,
however. For each solution (t,r) with t nonzero there is a 1-parameter family of solutions
$(\lambda t, r)$ for all real $\lambda$. The solution for each fixed r is a real projective variety in the t-space, so
the system of polynomials depends on n-1 independent affine parameters.
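
As a consistency check, the system 3.26b can be written out for the 2 x 2 numerical example of Section 2 in the normalization of this section. The shorthand $\rho = 1/r$ is introduced here only for brevity and is not part of the original notation. With $H^j(r) = \rho\, m_j^* m_j - e_j e_j^T$ and M from the example,

$$p(t_1, t_2, \rho) = -100\rho\, t_1^2 + (2\rho^2 - 2\rho + 1)\, t_1 t_2 - 0.01\rho\, t_2^2 ,$$

$$\frac{\partial p}{\partial t_1} = -200\rho\, t_1 + (2\rho^2 - 2\rho + 1)\, t_2 = 0, \qquad \frac{\partial p}{\partial t_2} = (2\rho^2 - 2\rho + 1)\, t_1 - 0.02\rho\, t_2 = 0 .$$

The Euler identity reads $t_1 \frac{\partial p}{\partial t_1} + t_2 \frac{\partial p}{\partial t_2} = 2p$, so 3.26a follows from these two equations. They are linear in $(t_1, t_2)$ and have a nonzero solution exactly when the determinant of their coefficients vanishes:

$$g(\rho) = (2\rho^2 - 2\rho + 1)^2 - 4\rho^2 = (2\rho^2 + 1)(2\rho^2 - 4\rho + 1) = 0 .$$

The real roots are $\rho = 1 \pm \frac{\sqrt{2}}{2}$. The smaller one gives the largest r, namely $r_0 = 1/(1 - \frac{\sqrt{2}}{2}) = 2 + \sqrt{2}$, which agrees with the value of $R_0$ obtained by the four-step procedure of Section 2.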

The  general  technique  for  solving  this  type  of  problem  is  elimination  theory,  as  described  in 
[Van  der  Waerden].  The  resulting  polynomials  can  have  very  large  degree,  however,  and  it 
might  be  better  to  take  advantage  of  the  structure  of  the  polynomial  system.  The  technique 
used  to  test  the  theory  on  the  3  and  4  block  cases  is  shown  below  as  Constructive  Algorithm 
3.1. 

All the constructions are now complete. Denote by V the set of real solutions $r_0$ of the
resultant polynomial $g(\frac{1}{r_0})$. In the first part of the proof we showed that $S_{sing}$ is a subset of
V. In the second part we showed that V is a subset of S. The largest value $R_{max}$ of S lies in
$S_{sing}$, therefore V is a finite subset of S containing $R_{max}$. For $r > R_{max}$ we have $S_r$ is
empty, but $S_{R_{max}}$ is nonempty. From Lemma 3.1, if $S_{sing}$ is nonempty then $\mu(M) = \sqrt{R_{max}}$. If S
is empty then $R_{max}$ is not defined and $\mu(M) = 0$.

Theorem  3  is  proved. 


Constructive Algorithm 3.1: In the proof of Theorem 3 we referred to a constructive
algorithm that takes advantage of the structure of the polynomial system. We illustrate the
algorithm here with the example from the 3-block case.

The most general form of the determinant polynomial in the 3-block scalar case is:

$$\det(H(t)) = c_{210}\, t_1^2 t_2 + c_{201}\, t_1^2 t_3 + c_{120}\, t_1 t_2^2 + c_{111}\, t_1 t_2 t_3 + c_{102}\, t_1 t_3^2 + c_{021}\, t_2^2 t_3 + c_{012}\, t_2 t_3^2 \tag{3.35}$$

The coefficients $c_{ijk}$ are cubic polynomials in $\frac{1}{r}$. The three partial-derivative functions are
easily computed:

$$\frac{\partial \det(H)}{\partial t_1} = 2c_{210}\, t_1 t_2 + 2c_{201}\, t_1 t_3 + c_{120}\, t_2^2 + c_{111}\, t_2 t_3 + c_{102}\, t_3^2 , \tag{3.36a}$$

$$\frac{\partial \det(H)}{\partial t_2} = c_{210}\, t_1^2 + 2c_{120}\, t_1 t_2 + c_{111}\, t_1 t_3 + 2c_{021}\, t_2 t_3 + c_{012}\, t_3^2 , \tag{3.36b}$$

$$\frac{\partial \det(H)}{\partial t_3} = c_{201}\, t_1^2 + c_{111}\, t_1 t_2 + 2c_{102}\, t_1 t_3 + c_{021}\, t_2^2 + 2c_{012}\, t_2 t_3 . \tag{3.36c}$$

Our goal is to determine a relation on the coefficients that holds whenever the system gives a
nontrivial point in $S_{sing}$. The approach is to construct a 12 x 12 matrix out of the $c_{ijk}$ such
that the determinant vanishes whenever the r that they depend upon lies in $S_{sing}$.

The procedure for constructing the matrix is as follows. For each of the three derivative
polynomials, form the product with four separate monomials:

$$T_1 = \{ t_1^2,\; t_1 t_2,\; t_1 t_3,\; t_2 t_3 \} \tag{3.37a}$$

$$T_2 = \{ t_1 t_2,\; t_1 t_3,\; t_2^2,\; t_2 t_3 \} \tag{3.37b}$$

$$T_3 = \{ t_1 t_2,\; t_1 t_3,\; t_2 t_3,\; t_3^2 \} \tag{3.37c}$$

The result is 12 polynomials that are generated by the 12 monomials:

$$B = \{ t_1^3 t_2,\; t_1^3 t_3,\; t_1^2 t_2^2,\; t_1^2 t_2 t_3,\; t_1^2 t_3^2,\; t_1 t_2^3,\; t_1 t_2^2 t_3,\; t_1 t_2 t_3^2,\; t_1 t_3^3,\; t_2^3 t_3,\; t_2^2 t_3^2,\; t_2 t_3^3 \} \tag{3.38}$$

Form the coefficient matrix for the 12 polynomials. If there is a solution to the original three
equations for which the function H(t) does not vanish identically (at least 2 of the three values
$t_1, t_2, t_3$ must be nonzero) then the determinant of the 12 x 12 coefficient matrix will vanish.
Conversely, if the determinant vanishes then there is some solution of the original three
equations with the property that at least 2 of the three values $t_1, t_2, t_3$ are nonzero.
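
The construction of the 12 x 12 coefficient matrix can be sketched as follows. This is an illustration under stated assumptions, not the program used for the tests reported here: the function names are hypothetical, the coefficients are taken as given numbers instead of polynomials in 1/r, and the test cubic $(t_1 - t_2)^2 t_3$ is made up because its gradient visibly vanishes at a point with more than one nonzero coordinate. Exact rational arithmetic is used so that a vanishing determinant is not blurred by rounding.

```python
from fractions import Fraction
from itertools import product

# An exponent triple (i, j, k) stands for the monomial t1^i t2^j t3^k.
# Multiplier sets T1, T2, T3 of (3.37a)-(3.37c).
T = [
    [(2, 0, 0), (1, 1, 0), (1, 0, 1), (0, 1, 1)],
    [(1, 1, 0), (1, 0, 1), (0, 2, 0), (0, 1, 1)],
    [(1, 1, 0), (1, 0, 1), (0, 1, 1), (0, 0, 2)],
]
# The 12 quartic monomials of (3.38): total degree 4, every exponent at most 3.
B = [e for e in product(range(4), repeat=3) if sum(e) == 4]

def partial(cubic, var):
    """Differentiate {exponents: coefficient} with respect to t_(var+1)."""
    out = {}
    for e, coeff in cubic.items():
        if e[var] > 0:
            d = tuple(x - (1 if i == var else 0) for i, x in enumerate(e))
            out[d] = out.get(d, 0) + e[var] * coeff
    return out

def coefficient_matrix(cubic):
    """Rows: the 12 products (monomial in T_i) * (dp/dt_i), expanded in the basis B."""
    rows = []
    for var in range(3):
        dp = partial(cubic, var)
        for m in T[var]:
            prod = {tuple(a + b for a, b in zip(e, m)): v for e, v in dp.items()}
            assert all(e in B for e in prod), "product left the span of B"
            rows.append([Fraction(prod.get(b, 0)) for b in B])
    return rows

def det(matrix):
    """Exact determinant by fraction-valued Gaussian elimination."""
    a = [row[:] for row in matrix]
    n, d = len(a), Fraction(1)
    for i in range(n):
        piv = next((r for r in range(i, n) if a[r][i] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != i:
            a[i], a[piv] = a[piv], a[i]
            d = -d
        d *= a[i][i]
        for r in range(i + 1, n):
            f = a[r][i] / a[i][i]
            for col in range(i, n):
                a[r][col] -= f * a[i][col]
    return d

# Made-up test cubic with a known singular point: p = (t1 - t2)^2 t3.
cubic = {(2, 0, 1): 1, (1, 1, 1): -2, (0, 2, 1): 1}
C = coefficient_matrix(cubic)
t0 = (1, 1, 2)   # the gradient of p vanishes here; more than one entry is nonzero
basis_at_t0 = [t0[0] ** i * t0[1] ** j * t0[2] ** k for i, j, k in B]
residuals = [sum(c * b for c, b in zip(row, basis_at_t0)) for row in C]
```

Every row of C annihilates the vector of basis monomials evaluated at the singular point, so the determinant is forced to vanish; this is the first of the two implications stated above. The sketch does not test the converse.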

To generate the polynomial g of Theorem 3, substitute the appropriate polynomial
expression into the $c_{ijk}$ functions that appear in the expression for the determinant of the coefficient
matrix.

We solved the four-block case by a similar, explicit scheme. The result was (for one example)
a 68 x 68 matrix that generically had rank 62 (we also tried a degenerate (definition 3.1)
example that generically had rank 49).

Remark 3.7: When testing the algorithm we did not explicitly evaluate the polynomial g; there was no need. Instead, we wrote a computer program that generated, for a given value of r, the coefficient matrix corresponding to that value. The value of r was then allowed to vary, by small increments, over a specified range of values. At each value of r an appropriate singular value of the coefficient matrix was computed using the LINPACK program dsvd. A plot was then drawn for visual inspection. Some sample plots are shown in the accompanying figures on the next page.
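The sweep described in Remark 3.7 can be sketched as follows. This is a minimal illustration, not the original program: it uses NumPy's `numpy.linalg.svd` in place of the LINPACK routine, and `coefficient_matrix` is a hypothetical 3 x 3 stand-in (the real matrices are 12 x 12 or 68 x 68 and depend on r through the $c_{ijk}$). The toy family loses rank at r = 0.5 and r = 2.0, so the tracked singular value dips to zero there.

```python
import numpy as np


def coefficient_matrix(r):
    """Hypothetical stand-in for the r-dependent coefficient matrix."""
    return np.diag([r - 0.5, r - 2.0, 1.0])


def sweep(matrix_of, r_values, index=-1):
    """Singular value number `index` (descending order) at each value of r."""
    return np.array([np.linalg.svd(matrix_of(r), compute_uv=False)[index]
                     for r in r_values])


r_values = np.linspace(0.0, 3.0, 301)      # small increments over a range
sigma = sweep(coefficient_matrix, r_values)
dips = r_values[sigma < 1e-8]              # grid points of numerical rank loss
largest_dip = dips.max()
print(dips, largest_dip)
```

In the four-block example of the text the tracked quantity would be singular value #62 of the 68 x 68 matrix, which corresponds to `index=61` in this sketch. A fixed threshold on a fixed grid only finds dips that fall on grid points; a refinement step around each local minimum would be needed in practice.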

The plots on the next page show the results when the exact 4-block $\mu$ computation algorithm is run on an example. The example matrix is the sample test matrix M in the family of matrices in [Robust Control of Multivariable and Large Scale Systems, Final Technical Report to AFOSR, Contract No. F49620-86-C-0001, March 23, 1988 by Andy Packard], page 39 with parameters defined on page 40. The interesting aspect of this example is that the upper and lower bounds do not agree, so no technique for finding the exact value of $\mu$ was known until the constructive algorithm 3.1 was developed.

Using the constructive algorithm in the 4-block case, we generated a 1-parameter family of 68 x 68 matrices that had entries linear in the $c_J$ variables. There were 6 relations in the generator set, so the rank of the matrix was generically 62. The horizontal axis of the plot is the so-called mu-parameter. The largest mu-parameter value on the plot where the matrix loses rank is $\sqrt{R_{max}} = \mu(M)$. Note that there appear to be only two nonzero mu-parameters where the matrix loses rank. This observation is significant because it means that the majority of the 62 roots of the associated polynomial are probably complex, leaving what could be a low order invariant factor that contains the sought point $R_{max}$ (if more roots were real we would expect to see many dips in the plot).

The second plot on the page shows the value of singular value #63. Note that it is more than 10 orders of magnitude smaller than the average value of singular value #62. This gap indicates a clear boundary where the matrix becomes rank-deficient. The third plot shows the two singular values plotted on a common logarithmic scale for direct comparison of magnitudes.

Finally, the last plot shows a blowup of the region of interest near the numerical zero of singular value #62. For comparison, the lower bound program mup gave a lower bound of 0.8723 and an upper bound of 1.000 for $\mu(M)$.

The constructive algorithm required roughly 6 SPARC seconds to evaluate the 68 singular values at each mu-parameter value on the plot. For a typical plot consisting of 200 points, the time required to evaluate a 4-block mu value exactly is just over 10 minutes. This is more than 100 times the time that mup requires to provide the upper and lower bounds.

4.  Observations and Loose Ends


Theorem 3.3 in the previous section is an important breakthrough for the $\mu$-theory. It provides a theoretical algorithm for computing $\mu(M)$, and it led to the functional algorithm we now have on the computer that can solve the four-block diagonal problem.

That theorem alone, however, does not provide a practical solution for the diagonal $\mu$-problem of arbitrary size. The computational method we have implemented does not generalize easily to arbitrary dimensions. There are related theoretical developments still in progress, however, that could improve this situation. This section is mainly concerned with other research in progress that, together with Theorem 3.3, could bring us to a more complete understanding of the $\mu$-problem.

One point worth mentioning right away is that we are not sure whether $S_{sing}$ is a good set of points to work with. We have worked only a few examples in the case of 2, 3 and 4 scalar blocks, and little is known in general about the structure of S and how $S_{sing}$ sits inside it. It seems likely that the theory would improve immensely if the global structure of some examples were worked out in detail. That is the long-term goal of the research described in this section.

Observation 4.1: Much of the theory presented in Section 3 was developed in a different framework by Andy Packard. In his Ph.D. thesis, he derived a set of polynomial conditions that he used to define $\mu$-values [Packard 1]. There is a clear correspondence between his polynomials and those in Theorem 3.2. The iterative lower-bound algorithm that Andy developed [Packard 2] is based on a decomposition that makes use of a stationarity condition of an associated gradient function.


Observation 4.2: There could be a better statement of the result in Theorem 3.3 more closely related to both the computational methods and the geometric structure. For robust computation we choose to compute the points of $S_{sing}$ by means other than polynomial-root finding. As mentioned in Remark 3.7 at the end of Section 3, the earliest numerical test of the theory on 3 and 4 block examples did not rely on finding the roots of the polynomial g (we did not even compute the polynomial g).


Observation 4.3: The reader should note that the methods used to obtain the polynomial conditions in the n-block case in Section 3 are different from those used in the 2-block case presented in Section 2. In Remark 2.1 an alternative criterion for pairs of indefinite 2 x 2 matrices was stated, and our initial approach to the general theory was based on a generalization of that alternative condition. We state the (incorrect) generalization here.

Incorrect General Statement 4.1: Let $H^j$, $j = 1, \dots, n$, be a set of indefinite n x n Hermitian forms. Define $Z(H^j)$ to be the zero set of $H^j$ viewed as a quadratic form on $C^n$:

$Z(H^j) = \{\, y \in C^n \mid y^* H^j y = 0 \,\}$ .  (4.1)

Then one of two alternatives holds:

1) There exist real numbers $t_1, \dots, t_n$ such that $\sum_{j=1}^{n} t_j H^j$ is definite.

2) The intersection of $Z(H^j)$ over all j contains a nonzero point.

This statement is true for n=2 but not true for n ≥ 4. The incorrectness of this general statement for n ≥ 4 is linked to the gap between $\mu(M)$ and $\inf\{\bar{\sigma}(D M D^{-1}) \mid D \in GL_{\Delta}\}$ (the upper bound found by Doyle in the original paper [Doyle] in which $\mu$ was defined). The approach in Section 2 might be extended to higher n as a way to relate the upper bound to $\mu$. Finding an exact bound on the size of the gap, even for the case n = 4, would be another significant breakthrough.
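For n = 2 the two alternatives are easy to check numerically. The sketch below is only an illustration with a pair of forms of our own choosing (not an example from this report): a 2 x 2 Hermitian matrix is definite exactly when its determinant is positive, so the first alternative is tested by sampling the determinant of $t_1 H^1 + t_2 H^2$ on the unit circle, and the second by exhibiting a common zero. Sampling is of course not a proof; for this pair the determinant is identically $-(t_1^2 + t_2^2)$.

```python
import math

# Two indefinite 2x2 Hermitian forms (illustrative choice, an assumption).
H1 = [[1, 0], [0, -1]]
H2 = [[0, 1], [1, 0]]


def form(H, y):
    """Evaluate y* H y, which is real when H is Hermitian."""
    total = sum(y[i].conjugate() * H[i][j] * y[j]
                for i in range(2) for j in range(2))
    return complex(total).real


def pencil_det(t1, t2):
    """det(t1*H1 + t2*H2); the 2x2 pencil is definite iff this is > 0."""
    a = t1 * H1[0][0] + t2 * H2[0][0]
    b = t1 * H1[0][1] + t2 * H2[0][1]
    d = t1 * H1[1][1] + t2 * H2[1][1]
    return a * d - abs(b) ** 2


# Alternative 1 fails: no sampled combination is definite.
angles = [2 * math.pi * k / 360 for k in range(360)]
never_definite = all(pencil_det(math.cos(a), math.sin(a)) <= 0 for a in angles)

# Alternative 2 holds: y = (1, i) is a common nonzero zero of both forms.
y = (1 + 0j, 1j)
common_zero = abs(form(H1, y)) < 1e-12 and abs(form(H2, y)) < 1e-12
print(never_definite, common_zero)
```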

Loose End 4.1: So far we have not directly referred to invariants. In fact, much of this approach was motivated by the idea that invariant theory could uncover some practical results for the numerical solution of the problem. The function det(H(t)) is one type of invariant function; there are other invariants as well.

Only recently have we started looking at other invariants associated with the $\mu$-problem. The 2-block case is trivial; the 3-block case is already hard but it appears tractable. Our status in these two cases is presented below.

The two-block case: When n = 2 the polynomial det(H(t)) is:

$\det(H(t_1, t_2)) = c_{20} t_1^2 + c_{11} t_1 t_2 + c_{02} t_2^2$ .  (4.2)

The condition that the gradient of det(H(t)) vanish at $t^0$ is:

$\frac{\partial \det(H(t^0))}{\partial t_1} = 2 c_{20} t_1^0 + c_{11} t_2^0 = 0$ ,  $\frac{\partial \det(H(t^0))}{\partial t_2} = c_{11} t_1^0 + 2 c_{02} t_2^0 = 0$ .  (4.3)

These two equations can be satisfied for a nonzero $t^0$ only if the discriminant function $\Delta$ vanishes:

$\Delta(c_{20}, c_{11}, c_{02}) = 4 c_{20} c_{02} - c_{11}^2 = 0$ .  (4.4)

Recall that the coefficients $c_{ij}$ are polynomial functions of the same parameter that the polynomial g of Theorem 3.3 takes as its argument. That polynomial g is (functionally equivalent to) the polynomial $\Delta$ when it is evaluated as a function of r.
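The step from the gradient condition to the discriminant is a standard linear-algebra argument, written out here for completeness. The two gradient equations are linear in $t^0$, so they form a homogeneous 2 x 2 system, and a nonzero solution exists exactly when the determinant of that system vanishes:

```latex
\begin{pmatrix} 2c_{20} & c_{11} \\ c_{11} & 2c_{02} \end{pmatrix}
\begin{pmatrix} t_1^0 \\ t_2^0 \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \end{pmatrix},
\qquad
\det \begin{pmatrix} 2c_{20} & c_{11} \\ c_{11} & 2c_{02} \end{pmatrix}
= 4 c_{20} c_{02} - c_{11}^2 = \Delta(c_{20}, c_{11}, c_{02}) .
```

For instance, $c_{20} = 1$, $c_{11} = -2$, $c_{02} = 1$ gives $\Delta = 0$, and the quadratic is the perfect square $(t_1 - t_2)^2$, whose gradient vanishes along $t_1 = t_2$.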

The three-block case: When n = 3 the polynomial det(H(t)) is:

$\det(H(t)) = c_{210} t_1^2 t_2 + c_{201} t_1^2 t_3 + c_{120} t_1 t_2^2 + c_{111} t_1 t_2 t_3 + c_{102} t_1 t_3^2 + c_{021} t_2^2 t_3 + c_{012} t_2 t_3^2$ .  (4.5)


The problem is: find a set of polynomial functions of the coefficients $c_{ijk}$, analogous to the discriminant function $\Delta$ for the two-block case, such that the vanishing of those functions is equivalent to the condition that the gradient of det(H(t)) vanish at a point where at least 2 of the components $t_j^0$ are nonzero (we may assume that two $t_j^0$ are nonzero because of the non-vacuous assumption in Theorem 3.3).

It is possible that the trace of H(t) and the sum of second principal minors will help solve the problem (there is a discriminant for cubic polynomials in two variables that includes these quantities).

The technique developed by mathematicians to solve this type of problem is invariant theory; the solution may be found (in principle) by analysis of the induced representations of the general linear group GL(n) acting on degree-r homogeneous polynomials in t. All the basic invariants can be determined by classical methods (see [Weyl]).

Though  this  theoretical  answer  is  appealing,  it  is  never  clear  in  practice  what  invariants  to 
compute.  The  recipe  indicated  in  [Weyl]  is  to  use  the  so-called  symbolic  method,  a  technique 
that  can  be  used  to  generate  all  the  invariants  of  a  given  fixed  degree.  But  then,  he  goes  on  to 
say: 


"Great  as  this  accomplishment  is,  one  ought  to  point  out,  however,  that  the 
method  is  far  from  reducing  the  construction  of  a  finite  integrity  basis  for  form 
invariants  to  the  same  for  vector  invariants.  For  the  number  of  symbolic  vector 
arguments  u1,  •  •  •  ,  uv  we  have  to  introduce  [during  the  invariant  construction 
process]  is  dependent  on  the  degree  of  J(u)  [the  invariant  function],  and  we  must 
have  an  unlimited  supply  of  such  symbols  at  our  disposal  when  we  are  to  take 
into  account  invariants  J  of  all  possible  degrees." 

[Weyl,  p.  244] 

In  addition,  there  are  general  problems  for  which  the  full  solution,  even  if  computable,  might 
be  impractical  to  implement  in  a  computer  program. 

For the system of polynomials in Theorem 3.3, however, there is some hope for a satisfactory solution in low dimensions. At least in the three and four block cases the invariant computations are of a small enough size that they can be performed by hand.

For example, one of the simplest low-degree invariants, the Hessian

$X(t) = \det\left[ \frac{\partial^2 \det(H(t))}{\partial t_i \, \partial t_j} \right]$  (4.6)

(a covariant of weight 2, see [Weyl, p. 240]) provides a large set of nontrivial invariants to work with. Note that

$X(t) = \sum_{i_1, \dots, i_n} f_{i_1 \dots i_n}(c_J) \, t_1^{i_1} \cdots t_n^{i_n}$  (4.7)

is a polynomial in t, homogeneous of degree n(n-2), with coefficient functions $f_{i_1 \dots i_n}(c_J)$ that are homogeneous of degree n in the $c_J$ (J is a multi-index, e.g. J = 2022 for the coefficient $c_{2022}$ when n = 4). The summands in equation 4.7 are assumed to have been symmetrized, with the summation running over all indices $i_1, \dots, i_n$ such that $\sum_{j=1}^{n} i_j = n(n-2)$.

Our objective is to find relations among sets of invariants that hold whenever the value of r they depend on lies in $S_{sing}$ (see Theorem 3.2). (Note: an expression for the Hessian matrix and its determinant has been computed directly from the expression in equation 3.31. There might be symmetries but we have no definite results yet.)

Now  we  already  knew  that  there  is  a  single  polynomial  g(— )  that  these  coefficients  had  to 
satisfy: it can be generated by Sylvester's determinants in elimination theory (see [Van der Waerden]). The problem with Sylvester determinants is that the degree of the resultant polynomial grows rapidly as n increases: the resultant degree is (generically) the product of the degrees of the individual polynomials. For the system of polynomials in Theorem 3.3 there are n equations each of degree n-1, so the growth in degree is asymptotically $n^n$.
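The growth rate quoted above is quick to tabulate. The following sketch simply evaluates the product of degrees, $(n-1)^n$, for n equations of degree n-1 and compares it with $n^n$; the function name `degree_product` is ours. The ratio $(n-1)^n / n^n = (1 - 1/n)^n$ tends to 1/e, which is the sense in which the growth is of order $n^n$.

```python
def degree_product(n):
    """Product of the degrees of n equations, each of degree n - 1."""
    return (n - 1) ** n


# Tabulate n, the product of degrees, n^n, and their ratio.
for n in range(2, 8):
    print(n, degree_product(n), n ** n, degree_product(n) / n ** n)
```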

One  good  reason  for  believing  there  should  be  invariant  factors  for  Sylvester’s  determinant 
polynomial  is  the  existence  of  Constructive  Algorithm  3.1.  It  is  just  a  matter  of  identifying  the 
irreducible  representation  of  the  general  linear  group  associated  with  that  construction  and  we 
will  have  an  invariant  relation  to  work  with. 

Very recently, we checked to see what the Hessian polynomials look like for the case of n=3. In all, there are 10 polynomials of degree 3 in the 7 nonvanishing coefficients of the polynomial det(H(t)) in equation 4.5. Two of the 10 quantities (generalized discriminants) are presented in equation 4.10.

$c_{111} c_{210} c_{201} - c_{120} c_{201}^2 - c_{102} c_{210}^2$  (4.10a)

$c_{210} \, ( 8 c_{201} c_{021} - 4 c_{210} c_{012} - 4 c_{120} c_{102} + c_{111}^2 )$  (4.10b)

Up to a constant factor, equation 4.10a is the coefficient of $t_1^3$ and 4.10b is the coefficient of $t_1^2 t_2$ in the polynomial X(t). There are two quantities similar to 4.10a and five others similar to 4.10b (the six relations of type 4.10b seem to be subdivided into two groups of three according to orientation). There is one other quantity of a type not shown; that one arises from the coefficient of $t_1 t_2 t_3$ in X(t).
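The hand computation of the Hessian coefficients can be cross-checked mechanically. The sketch below is our own illustration, with hypothetical helper names and arbitrary integer test values for the seven coefficients of equation 4.5. It expands the 3 x 3 Hessian determinant with plain dictionary-based polynomial arithmetic and compares two of its coefficients with the expressions $c_{111}c_{210}c_{201} - c_{120}c_{201}^2 - c_{102}c_{210}^2$ and $c_{210}(8c_{201}c_{021} - 4c_{210}c_{012} - 4c_{120}c_{102} + c_{111}^2)$, which is how the two quantities of equation 4.10 are read here. The constant factors 8 and 2 come out of the expansion.

```python
from itertools import product

# Polynomials in (t1, t2, t3) are dicts {(i, j, k): coefficient}.


def pmul(p, q):
    out = {}
    for (e1, a), (e2, b) in product(p.items(), q.items()):
        e = tuple(x + y for x, y in zip(e1, e2))
        out[e] = out.get(e, 0) + a * b
    return out


def padd(p, q, sign=1):
    out = dict(p)
    for e, b in q.items():
        out[e] = out.get(e, 0) + sign * b
    return out


def pdiff(p, var):
    out = {}
    for e, a in p.items():
        if e[var] > 0:
            low = tuple(x - (1 if v == var else 0) for v, x in enumerate(e))
            out[low] = out.get(low, 0) + e[var] * a
    return out


def hessian_det(f):
    """Determinant of the matrix of second partials, by cofactor expansion."""
    h = [[pdiff(pdiff(f, i), j) for j in range(3)] for i in range(3)]

    def minor(a, b, c, d):  # a*d - b*c for polynomial entries
        return padd(pmul(a, d), pmul(b, c), sign=-1)

    det = pmul(h[0][0], minor(h[1][1], h[1][2], h[2][1], h[2][2]))
    det = padd(det, pmul(h[0][1], minor(h[1][0], h[1][2], h[2][0], h[2][2])), sign=-1)
    det = padd(det, pmul(h[0][2], minor(h[1][0], h[1][1], h[2][0], h[2][1])))
    return det


# Arbitrary test values (assumptions) for the coefficients of equation 4.5.
c210, c201, c120, c111, c102, c021, c012 = 2, 3, 5, 7, 11, 13, 17
f = {(2, 1, 0): c210, (2, 0, 1): c201, (1, 2, 0): c120, (1, 1, 1): c111,
     (1, 0, 2): c102, (0, 2, 1): c021, (0, 1, 2): c012}
X = hessian_det(f)

q_a = c111 * c210 * c201 - c120 * c201 ** 2 - c102 * c210 ** 2
q_b = c210 * (8 * c201 * c021 - 4 * c210 * c012 - 4 * c120 * c102 + c111 ** 2)
print(X.get((3, 0, 0), 0) == 8 * q_a, X.get((2, 1, 0), 0) == 2 * q_b)
```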

We  have  not  solved  the  problem,  but  we  can  state  precisely  what  we  hope  for:  THE  BEST 
POSSIBLE  RESULT. 

THE BEST POSSIBLE RESULT: For the three-block problem, the best possible result would be to find a small set of low-degree invariants, similar in form to those in equations 4.10a and 4.10b, that generate the full ring of invariant functions. It would then follow that the Sylvester determinant polynomial, being an invariant polynomial, would lie in the polynomial ideal generated by those polynomials. For the examples in equation 4.10 each coefficient $c_J$ is a polynomial of degree 3 in the underlying parameter, which would therefore lead to generators of order no greater than 9.

If such a result could be found then the Sylvester determinant could be factored into primitive invariants and then, to compute $\mu(M)$, one needs only find all the real roots of the factor polynomials and call the largest one $R_{max}$; then $\mu(M) = \sqrt{R_{max}}$.
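The last step (real roots, largest one, square root) can be sketched as follows. This is an illustration under assumptions: the factor polynomial is taken to be given by its coefficients in the variable whose largest real root is $R_{max}$, the example polynomial $(R-4)(R-1)(R^2+1)$ is invented for the purpose, and `mu_from_factor` is a hypothetical name. NumPy's `numpy.roots` returns all complex roots; the imaginary ones are filtered out by a tolerance.

```python
import numpy as np


def mu_from_factor(coeffs, tol=1e-8):
    """sqrt of the largest positive real root, or None if there is none.

    `coeffs` lists the polynomial coefficients, highest degree first.
    """
    roots = np.roots(coeffs)
    real = roots[np.abs(roots.imag) < tol].real   # drop the complex roots
    positive = real[real > 0]
    if positive.size == 0:
        return None
    return float(np.sqrt(positive.max()))


# (R - 4)(R - 1)(R^2 + 1) = R^4 - 5R^3 + 5R^2 - 5R + 4: real roots 4 and 1.
mu = mu_from_factor([1, -5, 5, -5, 4])
print(mu)
```

Root finding on high-degree polynomials is numerically delicate, which is one reason the rank test on the coefficient matrix was preferred in the experiments described in Section 3.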

We emphasize that we have not achieved the best possible result, nor do we feel totally confident that we can achieve it. On the other hand we have good reason to expect some progress. Some hope seems justified because the problem originated from a system of quadratic forms $y^* H^j y = 0$, and problems involving sets of quadratic forms have been solved in the past [Bromwich]. We have only seen results for one parameter families, however (our problem is exactly the one Bromwich treats, but for n-parameter families and lower rank forms).

It  is  known  that  there  is  a  finite  set  of  polynomial  generators  and  the  classical  theory  provides 
exhaustive  methods  for  finding  such  a  set.  Even  so,  the  problem  of  demonstrating  an  explicit 
set  of  generators  is  usually  not  easy.  Fortunately,  there  is  another  possibility: 

AN EASIER RESULT OF EQUAL PRACTICAL VALUE: There is a deterministic process that can be used to tell in finite time whether a given set of invariants is good enough to solve the $\mu$-problem. For our application it would be sufficient, and no loss in terms of practical value, if we could find any set of low order invariants that factorize the Sylvester polynomial. This restriction allows us to reformulate the problem in very explicit terms: using a specific set of low order invariant polynomials, generate a complete linearly independent set (over the real numbers) of invariant polynomials of degree less than or equal to the degree of the Sylvester polynomial. Determine whether the Sylvester polynomial lies in the span of this set. (Besides those invariants already mentioned, a reasonable set of invariants to pick for this application are those that can be generated by the symbolic method [Weyl] applied to the form det(H(t)).) By restricting attention to this specific problem we avoid the general problem that might be too hard to solve. This problem for the case n=3 is certainly not too hard: we have already demonstrated a constructive algorithm that provides a polynomial of degree 36 (and degree 272 for n=4) in the underlying parameter whose vanishing is a sufficient condition that r be in $S_{sing}$. If possible, we would like to find a minimal set of lower order polynomials (e.g. two of order 18 for n=3) to do the same job.

It is worth noting that there are computer programs that generate and work with polynomial invariants using symbolic manipulation. The formulas 4.10a and 4.10b were computed by hand in about one hour's time. That same computation could be performed by computer much faster (and with fewer errors at intermediate stages). A computer could also handle larger problems: it could generate the Hessian invariants for the four, five, and six block problems, for example. Moreover, there are deterministic algorithms for taking the (moderately large) set of invariants generated in this way and reducing it to a minimal set of generators. Finally, a computer could be used to determine whether a set of low-order invariants generates the Sylvester
polynomial.  Manual  -  xis  should  be  adequate  to  perform  these  tasks  for  the  three  and  four 
block  problems. 

That summarises where we are so far on invariants. In the near future we hope to determine, at least in the case n=3, what the primitive factors of Sylvester's determinant polynomial are. Surely this problem has been worked on before [we have some leads, but no references yet]. The variety Z(det(H)) for n=3 is a singular cubic curve in $P^2$, and for n=4 it is a singular quartic surface in $P^3$. These two low-dimensional cases are reasonably well understood by algebraic geometers [Hartshorne]. It looks like algebraic geometry and the invariant polynomials could help us understand the low-dimensional problem and possibly lead us to a more efficient computational algorithm. To meet the improved efficiency challenge for n=4, it has to beat roughly 100 subroutine calls to the LINPACK routine dsvd with a 62 x 62 real matrix. An important practical goal is to find the real solutions efficiently by minimizing the computational overhead of filtering out the imaginary ones. For the few examples we have tested there seem to be relatively few real solutions, given the order of the underlying polynomials.

Last Minute Remarks: Given more time, we would have reworked the Canonical Form Theorem 3.1 so that the matrix U would be uniquely determined. For points in $S_{sing}$ we now know how to do that; the approach is roughly as follows:

At $r \in S_{sing}$ the Singularity Condition implies that a U can be chosen so that

$T_{n-1} \, [k^j (\alpha^j - \beta^j)]$  (4.11)

($T_{n-1}$ from equation 3.10) is a real vector for all j. Furthermore, assuming the matrix rank is compatible, the complex coefficient matrix can be normalized so that the n-1 x n-1 matrix:

$T_{n-1} \, [k^2 (\alpha^2 - \beta^2) \ \cdots \ k^n (\alpha^n - \beta^n)]$  (4.12)

is the identity matrix. The phase of the vector k can be normalized so that the nonzero entry corresponding to the smallest index is positive. (Of course, there are exceptional cases where a different parametrization of a similar type must be used.) Given these extra conditions, the canonical forms are reduced to their essential moduli.

Looking back at the 3-block problem, we see that this true canonical form leads to a manageable parametrization of the polynomial det(H(t)) in terms of $\alpha^j$, $\beta^j$ and k. With all the free parameters absorbed into U, the polynomial expression can be parametrized directly by 11 real parameters (the $c_J$ then become functions of these 11 geometric parameters). The dimension count breaks down as follows: only two real parameters are left in the vectors $k^j (\alpha^j - \beta^j)$; and the only other parameters needed are 9 more for the real part of the 2 x 2 matrix of cofactors in the lower right principal minor of the canonical form (the imaginary part of the cofactor matrix does not contribute to the determinant when the $k^j (\alpha^j - \beta^j)$ vectors are real). Of these 11, 3 disappear immediately because of the vanishing coefficients of the $t_j^3$ terms. Thus the seven algebraic coefficients $c_J$ are parametrized fairly efficiently by the geometric moduli. The goal in this case would be to generate two more independent relations that must hold for $r \in S_{sing}$.

With  the  more  efficient  canonical  form,  perhaps  the  three-block  problem  can  be  solved  by  a 
direct  computational  approach. 


Final Note: The author noticed the following just before the final deadline:

$\det(H(t)) = \sum_{i,j} q_{ij}(t) \, (t_i - t_i^0)(t_j - t_j^0)$ .  (4.13)

The polynomials $q_{ij}(t)$ of degree n-2 are not necessarily homogeneous. This globally valid expression for the polynomial in t has, in its expansion, expressions of varying total degree in t, yet it is known a priori to be homogeneous of degree n. Consequently, there is a set of n-1 nontrivial conditions $R^k$, one for each positive integer k < n, defined on the coefficients of the polynomials $q_{ij}(t)$, that must hold if the homogeneous polynomial of degree n-k in t is to vanish.

These conditions $R^k$ are the key to THE EASIER RESULT OF EQUAL PRACTICAL VALUE: they should lead to a primary decomposition of the polynomial ideal generated by the gradient polynomials $\frac{\partial \det(H(t))}{\partial t_j}$.

For k=1, the condition $R^1$ will be generated by a set of invariants $R_i^1$, each of which is the determinant of an n x n square matrix $A_i^1$ linear in the coefficients of the polynomials $q_{ij}(t)$. The condition $R^1$ is satisfied only if all these determinants vanish.

For higher k the picture is not so clear, but it looks as if $R^k$ is also generated by a set of determinant functions: this time for matrices $A_i^k$ of size equal to $s_k = \binom{n+k-1}{k} = \frac{(n+k-1)!}{k! \, (n-1)!}$, the dimension of the space of homogeneous polynomials of degree k in n variables. Each matrix $A_i^k$ will have entries that are polynomial functions (of degree < k?) in the coefficients of the polynomials $q_{ij}(t)$.
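The matrix sizes in question follow from the standard count of monomials: the space of homogeneous polynomials of degree k in n variables has dimension $\binom{n+k-1}{k}$. A short check (the function name `s` is ours) confirms that k = 1 gives the n x n case mentioned above, and that for n = 3 the 15 quartic monomials minus the three pure fourth powers leave the 12 monomials of equation 3.38.

```python
from math import comb


def s(k, n):
    """Dimension of the space of homogeneous polynomials of degree k in n variables."""
    return comb(n + k - 1, k)


# Sizes for k = 1, 2, 3 in the three- and four-block cases.
for n in (3, 4):
    print(n, [s(k, n) for k in range(1, n)])
```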

We believe we can solve THE EASIER RESULT OF EQUAL PRACTICAL VALUE by the following algorithm:

1) For k=1,2 map the coefficients of the Hessian matrix polynomials $\frac{\partial^2 \det(H)}{\partial t_i \, \partial t_j}$ into the matrices $A_i^k$ for the two values k=n-1 and n-2 (the Hessian polynomials are homogeneous of degree n-2, so lower order expressions in the expansion 4.13 vanish identically).

2) Evaluate the determinants of $A_i^k$ to obtain the invariant functions $\phi^k(c)$, where c is the vector of coefficients $c_J$ (J a multi-index of total degree n) of the original function det(H(t)).

3) Consider the set of invariant polynomials $\phi^k(c)$. It looks like the degrees are n (when k=1) and n(n+1) (when k=2), so the evaluation of these invariants is a reasonable numerical task (compared with $n^n$). The value r on which c depends is in $S_{sing}$ (see Theorem 3.3) if and only if all the invariants $\phi^k(c)$ are zero.


We  will  investigate  this  idea  further. 


Notes  for  Section  4: 

For background in invariant theory the reader is referred to [Weyl]. An older, more elementary book on invariants of (families of) quadratic forms is [Bromwich]. In some respects the approach we have taken here is an attempt to generalize the problem discussed in Bromwich's book, but we are (so far) only working with families of rank-2 forms. An introduction to the modern theory of Algebraic Geometry is available in [Hartshorne]. A brief and interesting elementary chapter on real algebraic geometry can be found in [Milnor]. Finally, for anyone interested in working along these lines who is not familiar with the history of algebraic problems, we recommend [Dieudonne] as a general historical survey and the article by [Kleiman] as a brief survey of the related subject of Schubert's calculus. We did not have a chance to discuss here the Schubert-cycle interpretation of the Constructive Algorithm 3.1; that will have to wait until later.